A Plea for Configuration Management Sanity
I’ve long taken configuration management and revision control for granted. Early in life I lost work because I wasn’t using a repository system, and I resolved that such a thing would not happen again. During my senior year in college, I was working on a project with a randomly assigned classmate. When we started, I explained that I’d set up a CVS repository for the code and gave him the directory for CVSROOT. The next time we met, he told me he hadn’t had any trouble finding the code, but asked why the file names had commas and a couple of random letters at the end of them. He had been browsing the repository’s internal history files, which CVS names by appending “,v” to each file name, rather than checking out a working copy. In retrospect, it was wrong of me to laugh, but I honestly thought he was making a joke with a deadpan delivery.
These days, when I walk onto a project, I can make an educated guess about how it’s going to go based on how much work it takes to get up and running with the SCM and build system.
Getting to know a project is akin to meeting a person. Whether consciously or not, I form a first impression based on a handful of rules of thumb accumulated over the years. One rule of thumb is that it should take no more than a day to have a functioning copy of the source code, build environment, and automated test setup. If it takes longer - hunting through branches for the right version of the code, tracking down library versions, remembering which tags actually need to be checked out, or wrestling with a tool chain that was never standardized - then I know I have a lot of remedial work ahead of me.
I’m not including the time spent hassling IT folks to get an account created or setting up test environments with device hardware in the loop. That’s a separate issue for another blog post. This is just getting the productivity loop going: check out code, write tests, write code, build, test, declare victory.
There’s no magic to a successful CM system. The right things to do are written up in a thousand FAQs, user manuals, and technical books. Here are a couple that have been on my mind of late.
First, standardize your tool chain. Decide what editor, build system, compiler, linker, and testing tools are used on the project. Keep installers for those programs around, unless they’re included in the operating system on your development systems. If you’re really zealous about this, you could even standardize the operating system and revision you use on your development platforms.
Programmers are individuals; doubtless someone will protest that they absolutely must use their favorite editor and command line interface to the revision control system or it will completely sap their productivity and make small children cry. You have two good options. Either refuse to budge, or tell them they’re welcome to substitute as long as no one else on the project can tell the difference. For example, if you standardize on Eclipse, a skilled emacs user will be able to edit code without anyone knowing the difference. In my book, that’s okay. However, if someone decides to replace the Makefiles with Ant build.xml files, then everyone on the project will see the difference: either everyone will have to convert to using Ant, the two build systems will have to be maintained in parallel, or at the very least the files for the redundant build systems will have to be stored in the repository. Those are very noticeable changes.
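A standard tool chain is easier to enforce if a script can check it. Here is a minimal sketch in Python, assuming the project writes down its required tools as version prefixes. The REQUIRED_TOOLS table and the function names are made up for illustration, and the script assumes each tool answers to a --version flag, which is common but not universal.

```python
import re
import shutil
import subprocess

# Hypothetical project standard: tool name -> required version prefix.
REQUIRED_TOOLS = {"make": "4.", "gcc": "12."}


def extract_version(output):
    """Return the first dotted version number in a tool's version banner."""
    match = re.search(r"\d+(?:\.\d+)+", output)
    return match.group(0) if match else None


def installed_version(tool):
    """Run `<tool> --version` and parse the result; None if the tool is absent."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return extract_version(result.stdout or result.stderr)


def find_mismatches(required, installed):
    """List (tool, required_prefix, found) for each missing or off-standard tool."""
    problems = []
    for tool, prefix in sorted(required.items()):
        found = installed.get(tool)
        if found is None or not found.startswith(prefix):
            problems.append((tool, prefix, found))
    return problems


if __name__ == "__main__":
    installed = {tool: installed_version(tool) for tool in REQUIRED_TOOLS}
    for tool, prefix, found in find_mismatches(REQUIRED_TOOLS, installed):
        print(f"{tool}: expected version {prefix}x, found {found}")
```

Run on a fresh development machine, a script like this prints one line per tool that is missing or at the wrong version, and nothing at all if the machine matches the standard.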
Next, put the libraries upon which your program depends under revision control. There used to be good reason not to do this: SCM systems would get bogged down handling large binary files. With modern server hardware and SCM software, however, this just isn’t a concern for most projects. The advantage is that someone checking out the code can check out the libraries as well and be assured that everything necessary to build the project is there. If you still can’t do this for some reason, keep a standard set of library files outside your SCM system, and keep a tight rein on them. Basically, if you don’t have the tool to support you, you’ve decided to do your own manual configuration control of the libraries.
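If the libraries do have to live outside the SCM system, one way to keep that tight rein is to check a manifest of checksums into the repository and verify the library directory against it. The following is only a sketch under that assumption. The manifest format (file name mapped to SHA-256 digest) and the function names are my own invention, not a feature of any particular SCM tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_libraries(lib_dir, manifest):
    """Compare lib_dir against a manifest of {file name: expected digest}.

    Returns a list of problem descriptions; an empty list means every
    library named in the manifest is present and unmodified.
    """
    problems = []
    for name, expected in sorted(manifest.items()):
        path = Path(lib_dir) / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(path) != expected:
            problems.append(f"changed: {name}")
    return problems
```

The manifest itself is small and textual, so it can sit in the repository next to the code even when the libraries cannot.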
It should also be easy to figure out how to check out the source code. By that, I mean that it should be clear what branches and tags are applicable for a given development task. There are lots of guides on how to do this, and in recent days, they haven’t even conflicted that much. Whatever repository layout you choose, just be sure that it’s easy to find where new code is being developed, old releases are being maintained, and particular side projects are being carried out.
I have to take a moment to rant against branches. Everyone’s been very excited about lightweight branches lately. We’ll have a branch for the released code, one for each maintenance release, one personal sandbox for each programmer, another sandbox for each different task the programmer is working on, another branch for demos to visitors. It’s the branch-making drinking game: “Every time someone says we need to be more agile, create a new branch in the repository!” I’ll admit, the ability of recent SCM tools to support lightweight branching and merging is pretty interesting. What seems to be missing is the cognitive support developers need to navigate the branches they create. The tragic result is a repository full of branches, where no one is sure which branch is used for what purpose. My best suggestion is to use branches - they are helpful - but use them only as necessary and with a clear naming scheme, so that there’s no confusion as to what goes where.
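A naming scheme only helps if something checks it. As a rough illustration, here is a Python sketch that sorts branch names by purpose and flags the ones nobody could place. The scheme itself (main, release/X.Y, maint/X.Y.Z, task/<ticket>-<description>) is a hypothetical example, not a recommendation from any particular guide. Substitute whatever convention your project settles on.

```python
import re

# Hypothetical naming scheme: purpose -> pattern a branch name must match.
BRANCH_PATTERNS = {
    "mainline": re.compile(r"^main$"),
    "release": re.compile(r"^release/\d+\.\d+$"),
    "maintenance": re.compile(r"^maint/\d+\.\d+\.\d+$"),
    "task": re.compile(r"^task/\d+-[a-z0-9-]+$"),
}


def classify_branch(name):
    """Return the purpose a branch name declares, or None if it fits no pattern."""
    for purpose, pattern in BRANCH_PATTERNS.items():
        if pattern.match(name):
            return purpose
    return None


def unclear_branches(names):
    """Return the branch names that do not follow the naming scheme."""
    return [name for name in names if classify_branch(name) is None]


if __name__ == "__main__":
    branches = ["main", "release/2.1", "bobs-sandbox", "task/142-fix-parser", "demo"]
    for name in unclear_branches(branches):
        print(f"What is this branch for? {name}")
```

Feeding it the list of branches from the repository gives a quick inventory of the ones whose purpose has to be explained by whoever created them.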
When populating the revision control system, don’t forget about test artifacts. If your product needs startup or configuration data, check it in. If the tests need test data, check it in. I shouldn’t have to go hunting through the tape archive of a programmer who got canned three years ago, trying to find his test data to determine whether his code (which has been sitting in the repository untouched for those three years) ever actually worked right according to his own unit tests. These unit tests, as you may have guessed, should also be checked into the repository.
Bottom line: everything necessary to build the software ought to go into the repository, and it should go in organized so that it’s ready to build when pulled back out. Now that this rant is out of the way, perhaps we can get back to solving the real problems.







