The Build Pipeline

All the tasks involved in building and deploying a software application take a measurable amount of time. As a project to build a kick-ass piece of software matures, each of the steps required to build it takes longer and longer: there are more tests to run, more files to compile and more stuff to compress into the archive.

Initially, everyone tries to run each of the steps on their developer workstation, and for a time this approach works perfectly well. Eventually, however, some stressed-out, caffeine-fuelled developer will snap and point out that they'd rather not spend half an hour watching their machine churn through the tests and create WARs, damn it! They'd much rather get on with the next thing on their list. And looking around the room, just about everyone on the project agrees with them. How do we speed up the build?

Actually, how do we speed up the build?

There are two ways: the clever and the dumb, and because we're clever software engineers we always reach for the lever with "clever" written on it first. So, we spend time profiling the build, finding bottlenecks, changing compilers, getting faster hard-disks, better file systems and generally optimising the heck out of as much stuff as possible. We probably have some nice coloured charts showing us what time is being spent where. It's all good fun and games, and is probably necessary work as we uncover inefficiencies in our processes.

And it makes no difference.

Over time, the build continues to get longer. Maybe each of the steps is now lightning fast, but we keep on adding stuff as more and more functionality is developed, and even though we're going as fast as we can and the build workstation is now CPU-bound, all this additional work takes its toll. So we reach for the lever with "dumb" written on it, but now that we look closer, it doesn't say "dumb", it says "smarter".

How do we work smarter? One way to make the build go faster would be to do less of it in the first place. You know, do less compilation, run fewer tests, put less into our archives. But don't we need all those steps we're thinking of not doing? After all, we've optimised the build to heck (an outer suburb of hell) and back again. We can't "not do" bits of it. You're right, we do have to do all those steps, but do they all need to be on the developer's machine? If we're smart, we're already using some sort of continuous integration server somewhere. Couldn't that take some of the strain?

So, we start to break the build up into chunks. The "pre-checkin" build runs in a reasonable amount of time, say about five minutes, and tries to cover off the major areas of risk. It ensures that the code compiles and that the "high risk" and "high value" tests pass. The "high risk" tests are those that are in the area of the application that's currently being worked on, or have been shown to be very fragile before, and the "high value" tests are those that provide a bellwether that everything is working as it should be. The build machine obviously runs everything.
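
As a rough sketch of that split, assume each test carries hand-applied tags. The class, the tag names and the sample test names below are all made up for illustration and don't come from any particular build tool. The pre-checkin build picks only the tagged tests, while the build machine takes the lot.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of test selection; not the API of any real build tool.
final class PreCheckinSelection {
    enum Tag { HIGH_RISK, HIGH_VALUE }

    // Pre-checkin build: only tests carrying at least one tag.
    static List<String> preCheckin(Map<String, Set<Tag>> tests) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<Tag>> entry : tests.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    // Build machine: everything, tagged or not.
    static List<String> fullBuild(Map<String, Set<Tag>> tests) {
        return new ArrayList<>(tests.keySet());
    }

    // Invented sample data: one risky test, one bellwether, one untagged.
    static Map<String, Set<Tag>> sampleTests() {
        Map<String, Set<Tag>> tests = new LinkedHashMap<>();
        tests.put("CheckoutFlowTest", EnumSet.of(Tag.HIGH_RISK));
        tests.put("SmokeTest", EnumSet.of(Tag.HIGH_VALUE));
        tests.put("ReportFormattingTest", EnumSet.noneOf(Tag.class));
        return tests;
    }

    public static void main(String[] args) {
        Map<String, Set<Tag>> tests = sampleTests();
        System.out.println("pre-checkin: " + preCheckin(tests));
        System.out.println("build machine: " + fullBuild(tests));
    }
}
```

The tags need revisiting as the work moves around the codebase, since "high risk" depends on what's being changed this week.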

And for a time, this strategy works. The developers are happy because they get to check in more frequently, and the build machine, being the biggest machine in the room, breezes through running everything pretty quickly. But soon we're starting to lose fast feedback: the build on the CI server is taking too long, and no-one can check in when the build's broken. We need to make things go faster again. But how?

We could always start to break the build up on the CI server. Perhaps we could run the risky stuff first, or do the steps that take longer sooner rather than later. Once we've done this, all sorts of possibilities open up. Perhaps it would be possible to have more than one build machine going at the same time, each working on a different part of the build? Perhaps we could arrange the parts of the build so that we get feedback on failures as quickly as possible? Perhaps, perhaps, perhaps...?

And this is how we end up with a build pipeline. Later stages tend to run slower than earlier stages, but everything is arranged to provide fast feedback. We know that we only have to manually test the builds that make it through the far end of the pipeline, and as we add stages we can add confidence that the application works as expected. If we're smart (quick! Pull the "clever" lever again!) we can automatically deploy the app into increasingly realistic environments and run tests against it as part of the pipeline, which is something we'd never imagine doing from the developer workstation.
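
Stripped of any real CI server, the idea looks something like the sketch below: stages run in order, cheapest first, and the pipeline stops at the first failure so nobody waits for the slow stages to hear bad news. The class and stage names are invented, and each stage is faked with a lambda that just reports pass or fail.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical pipeline model; a real CI server would run each stage as a job.
final class BuildPipeline {
    private final Map<String, BooleanSupplier> stages = new LinkedHashMap<>();

    // Stages run in the order they are added, so add the fast ones first.
    BuildPipeline stage(String name, BooleanSupplier work) {
        stages.put(name, work);
        return this;
    }

    // Runs stages in order and stops at the first failure.
    // Returns the names of the stages that passed.
    List<String> run() {
        List<String> passed = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> stage : stages.entrySet()) {
            if (!stage.getValue().getAsBoolean()) {
                System.out.println("FAILED at " + stage.getKey());
                break;
            }
            passed.add(stage.getKey());
        }
        return passed;
    }

    public static void main(String[] args) {
        List<String> passed = new BuildPipeline()
            .stage("compile-and-fast-tests", () -> true)
            .stage("slow-tests", () -> false)          // pretend this one breaks
            .stage("deploy-to-staging", () -> true)    // never reached
            .run();
        System.out.println("passed: " + passed);
    }
}
```

Only a build that passes every stage comes out of the far end, which is what makes it a candidate for manual testing.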

There are a few things that you can do to make the build pipeline easier to work with, but the key is that the first stage should be responsible for building all the artifacts required by later stages. This reduces the amount of time wasted in each stage by repeated checkouts and compilations since all they need to do is unpack a (possibly uncompressed) archive. It also means that should there be a problem with the build, a developer can grab the archive and start digging into exactly what was running on the build machine. For a .NET project, this leads you down a path that ends with the ability to do an xcopy deploy of your application. This is a Good Thing.
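
Here's a minimal sketch of that hand-off using the zip support in the JDK. The class name, directory layout and file names are assumptions made up for the example; your first stage will produce whatever your project actually needs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: the first stage packs, later stages unpack.
final class ArtifactArchive {
    // First stage: put every file under outputDir into a single archive.
    static void pack(Path outputDir, Path archive) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(outputDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(archive))) {
            for (Path file : files) {
                // Zip entry names use forward slashes on every platform.
                String name = outputDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    // Later stages: unpack rather than check out and compile again.
    static void unpack(Path archive, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(archive))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                Path out = root.resolve(e.getName()).normalize();
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + e.getName());
                }
                if (e.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path work = Files.createTempDirectory("pipeline");
        Path built = Files.createDirectories(work.resolve("stage1").resolve("classes"));
        Files.writeString(built.resolve("App.txt"), "stand-in for compiled output");

        Path archive = work.resolve("artifacts.zip");
        pack(work.resolve("stage1"), archive);

        Path stage2 = work.resolve("stage2");
        unpack(archive, stage2);
        System.out.println(Files.readString(stage2.resolve("classes").resolve("App.txt")));
    }
}
```

The same archive is what a developer would grab to poke at a failed build, so it's worth keeping it somewhere easy to find.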

Typically, you don't include the source code inside the archive, though you might have some record of the source control tag required to get it. A side-effect is that those builds that have a knowledge of what the source tree looks like need to provide a mechanism for overriding that knowledge, and they shouldn't count on it being there. One of the build experts at ThoughtWorks has spent the past few weeks drumming this lesson into me, and I'm glad to say that it's finally sinking in. We've settled on allowing sensible defaults to be overridden using system properties on our Java project, but there's normally something similar for other technologies.
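
In Java that can be as small as the sketch below. The property name build.source.root and the default of "src" are invented for the example and aren't the ones from our project; System.getProperty with a fallback value does the real work.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical settings holder; the property name and default are assumptions.
final class BuildSettings {
    static final String SOURCE_ROOT_PROPERTY = "build.source.root";
    static final String DEFAULT_SOURCE_ROOT = "src";

    // Sensible default, overridable with -Dbuild.source.root=...
    static String sourceRoot() {
        return System.getProperty(SOURCE_ROOT_PROPERTY, DEFAULT_SOURCE_ROOT);
    }

    // Later stages run from an unpacked archive, so the tree may not be there.
    static Optional<Path> sourceRootIfPresent() {
        Path root = Paths.get(sourceRoot());
        return Files.isDirectory(root) ? Optional.of(root) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println("source root setting: " + sourceRoot());
        System.out.println("present on disk: " + sourceRootIfPresent().isPresent());
    }
}
```

A later stage would then be started with something like java -Dbuild.source.root=/path/to/unpacked BuildSettings, while a developer workstation just takes the default.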

I'm a fan of build pipelines. Try them. You might be too.

Update: Julian Simpson, one of ThoughtWorks' build and systems experts and an all-round nice bloke, responds, pointing out some of the problems that an over-enthusiastic adoption of a build pipeline might have.


Simon Stewart on Saturday, 28 July, 2007
