2010-01-24  Continuous Integration for One

Executive summary: It's a good thing, but not for the reasons you expect.

Like many of my programming chums, I believe in the importance of the build. Breaking the build is a bad thing. Knowingly breaking the build is, well I'm not a religious man, but it's a sin.

Sometimes the build goes wonky or tests start failing because of mishap or misfortune. Someone makes a change to one place in the code, somebody else makes a change in another part, something goes awry and oopsy. Of course, tools can help us here. If you've got something that regularly checks your source code repository for changes and rebuilds everything as soon as it finds one, then you can find out if there's a problem sooner rather than later.

These tools have a name, continuous integration servers, and are so called because they automate the process of continuous integration as described by Martin Fowler. You can put something together yourself pretty easily, but there are a ton of them available. We've been using CruiseControl, one of the more widely used CI servers, to reasonable effect over the past few months of pretty intense development.
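
To make the "check the repository, rebuild on change" idea concrete, here is a minimal home-made sketch in Python. It is not how CruiseControl or any other CI server is implemented. The two commands (`bzr revno` to read a revision number, `make check` to build and test) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import subprocess
import time

# Assumed commands: substitute whatever your repository and build actually use.
REVISION_CMD = ["bzr", "revno"]   # prints the branch's current revision number
BUILD_CMD = ["make", "check"]     # builds the code and runs the tests


def run_revision_cmd() -> str:
    """Return the current revision identifier as reported by REVISION_CMD."""
    out = subprocess.run(REVISION_CMD, capture_output=True, text=True, check=True)
    return out.stdout.strip()


def run_build() -> bool:
    """Run the build; True means the command exited with status 0."""
    return subprocess.run(BUILD_CMD).returncode == 0


def poll_once(last_seen, get_revision=run_revision_cmd, build=run_build):
    """Check for a change once.

    Returns (revision, result), where result is None if nothing changed
    since last_seen, otherwise the boolean outcome of the build.
    """
    revision = get_revision()
    if revision == last_seen:
        return revision, None
    return revision, build()


def watch(interval_seconds: float = 60.0) -> None:
    """Poll forever, building whenever the revision changes."""
    last_seen = None
    while True:
        last_seen, ok = poll_once(last_seen)
        if ok is not None:
            print(f"revision {last_seen}: {'ok' if ok else 'BROKEN'}")
        time.sleep(interval_seconds)
```

Keeping the single check in `poll_once`, separate from the endless loop, means the logic can be exercised with stand-in functions without touching a real repository.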

We want to push our automated build a little bit further. For various reasons that I haven't the energy to go into here, our environment is awkward. Putting a release up onto the test servers is complex and still involves a certain amount of time and a degree of manual intervention. Consequently, it probably doesn't happen as often as it should. We're in a quieter period at the moment, but when things kick off again it's going to be a problem. What we'd like to do is finish automating that process and, having done that, get our CI server to build and deploy nightly into our test environment. With that complete, we're going to try and kick off the GUI testing scripts as well. Our testers can spend more time with their feet on the desk drinking tea, while the developers can be secure that yesterday's work does actually work.

Anyway, it's fallen to me to actually do this. One of the things I needed to do was spend a bit of time fiddling with the build server. CruiseControl, while a sound piece of kit, doesn't lend itself to easy fiddling. It's driven by configuration files, and every time you change the configuration you need to bounce the server. If you get something wrong it's not necessarily obvious why, and that's just a faff, and a time consuming one. By coincidence, several people on the accu-general mailing list had been discussing the Hudson build server. One thing that stuck in my mind was that Hudson could be configured entirely through its GUI. And so I gave it a try. I'm slightly ashamed to say that's the reason why, but I'm glad I did. Hudson is extremely easy to fiddle with - setting up builds, chaining different builds together, tool integration, and so on, are all straightforward and can be done via a web browser. I doubt there's anything it can do that can't be done with CruiseControl, but for exploratory work it's streets ahead.

Because I was in a poking about mode, I installed Hudson on my own server. Working against the machine sitting on the floor by my right ankle was just easier than working against a machine several steps away on the other side of a Citrix connection that isn't exactly snappy and which occasionally slows to a treacly crawl. So I had Hudson and I needed some code. Fortunately, I had some. From installation to building the Mango library and running its tests, via installing a plugin to pull code from Bazaar repositories, took about five minutes.

I poked around Hudson a bit more and found a setting to draw a chart of test results. I can see why such a thing is useful in a team situation, but Mango is a one man project. I build it and run the tests whenever I work on it. Poking some more, I found another setting that drew a chart of compiler warnings. That's when I had a little revelation.

Continuous integration servers are touted because they'll let you know, and know quickly, when the build has broken. And that's true and it's a real benefit. But it doesn't apply for a one man project. In fact, I'm not even sure it's true for most multi-person projects either. What CI servers can do for the one man project is remember. Remember your test results. Keep track of build times. Remember your compiler warnings. Stuff you wouldn't yourself record unless you were an outrageously meticulous record keeper.
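
The "remembering" can be pictured as nothing more than appending one row per build to a history file. The sketch below is an illustration of that idea, not a description of how Hudson stores its data. The CSV layout, the field names and both function names are assumptions made for the example.

```python
import csv
from pathlib import Path

# Hypothetical set of per-build facts worth remembering.
FIELDS = ["build", "tests_run", "tests_failed", "warnings", "build_seconds"]


def record_build(history: Path, **metrics) -> None:
    """Append one build's metrics to the history file, creating it if needed."""
    is_new = not history.exists()
    with history.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # column names go in once, at the top
        writer.writerow(metrics)


def warning_trend(history: Path) -> list[int]:
    """Return the warning count of every recorded build, oldest first."""
    with history.open(newline="") as f:
        return [int(row["warnings"]) for row in csv.DictReader(f)]
```

Calling `record_build` at the end of every build and plotting `warning_trend` is, in miniature, the chart-over-time that a one man project would otherwise never bother to keep.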

As it happened, Mango had no compiler warnings in the build. However, I had other code that did. It was written in Java several years ago, and I hadn't had reason to work on it for quite some time. Building with a modern Java compiler produced a slew of warnings, which I'd not had the inclination to fix. I hooked up Hudson to build it, and within moments I had a chart telling me exactly how many warnings I had - seventy - together with breakdowns of warnings by type and class name, hyperlinked into the source.

As I explored the warnings report, some of them looked very simple to fix. So I fixed them and committed the changes. So Hudson rebuilt the code and sent me an email telling me it had. That prompted me to look at the Hudson build console, where I saw the warnings graph had gone down a little. Which prompted me to fix another warning. You can probably fill in the rest.

By remembering what had happened in previous builds and then telling me in an easy to understand way, Hudson prompted me to fix a codebase that had languished for years. The second spike on the chart above is where, having sorted out everything in the main code, I turned on warnings in the test code.

The psychological effect of that graph was really remarkable. I decided to make deliberate use of it. When building Arabica I use GCC's default warnings. Recently I've been exchanging emails with Ash Berlin, who's using Arabica to provide XML support for the Flusspferd JavaScript project. Flusspferd builds at a higher warning level than I do, and Ash had sent me a few patches to silence various things he was seeing. Setting up Hudson to build Arabica took, again, only a couple of minutes even though it's a C++ project using Autotools rather than a Java project using Ant. I cranked up the warning level. 600 warnings! Blimey!

I suspect that Hudson's GCC warnings parser isn't quite as refined as the Java warnings parser, so I don't believe that was the true number. Nevertheless it was still a lot, and rather more than I was expecting. They were gone in 15 builds. All of them, gone. I didn't have to make 600 individual fixes - a change in an include file can wipe out several warning messages at a stroke - but I think that's pretty quick. And I know I'll keep those warnings at zero too.
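
One plausible reason a raw count runs high, and why a single fix in an include file removes several messages at once, is that a warning in a header is reported again for every source file that includes it. That is offered as an assumption, not as a statement about how Hudson's parser works. The sketch below counts raw and distinct warnings in a build log, assuming GCC's usual `file:line[:column]: warning: message` layout. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as
#   include/x.hpp:10:5: warning: unused variable 'n' [-Wunused-variable]
# The column number is optional because older GCC releases omit it.
WARNING_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?:\d+:)?\s*warning:\s*(?P<msg>.+)$"
)


def parse_warnings(log: str) -> list[tuple[str, int, str]]:
    """Extract (file, line, message) for every warning line in a build log."""
    found = []
    for text in log.splitlines():
        m = WARNING_RE.match(text)
        if m:
            found.append((m["file"], int(m["line"]), m["msg"].strip()))
    return found


def summarise(log: str) -> tuple[int, int, Counter]:
    """Return (raw count, distinct count, raw count per file)."""
    warnings = parse_warnings(log)
    per_file = Counter(file for file, _, _ in warnings)
    return len(warnings), len(set(warnings)), per_file
```

If the distinct count is much lower than the raw count, most of the noise is the same few header warnings repeated across translation units.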

When I started looking at Hudson, I was just after something I could easily fiddle around with while working on our build and deploy. I didn't expect that it would creep into my brain and change the way I worked on my individual projects.



Jez Higgins