We are currently using CCNet as our continuous integration server. Most projects check for changes every 30 seconds (the default) and, if needed, perform a build (unit tests, StyleCop, FxCop, etc.).
We’ve accumulated quite a few projects now, and the server spends most of its time near 100% CPU utilization. This has alarmed some of the development team, even though the server is still responsive and builds still take about as long as they always have.
It’s been suggested that we raise the check interval to about five minutes. To me that seems too long: we risk someone committing code and going home for the weekend, leaving a broken build that could hold up others. The counter-suggestion is that anyone who needs the results can force a build, but that seems to defeat the purpose of CI, which I thought was supposed to be automated.
My proposed solution is simply to get another build server and split the builds among the servers.
Am I thinking about this the wrong way, or is there a point where if integration isn’t often enough you’re not really doing CI anymore?
7
The CPU is supposed to be near 100% utilization. Otherwise you’re just wasting CPU. Same with memory.
Continuous means as frequently as you can manage. If your server is maxed out, then crank out the builds as fast as it will go. It might sometimes get busy enough to drag your response time down, but if you lower the check interval, your response time will always be dragged down. It makes no sense to try to “conserve” idle time on your server. If your actual response time frequently drops to unacceptable levels, then you add additional servers.
4
You should not be polling for changes – let the source control system ‘push’ commit notifications to the build system instead. (You may need a new SCM or build system for this; I seriously recommend Jenkins.)
That alone would obviously remove a lot of the load on the server.
The next question is how often commits happen: if people are committing every 5 minutes, push notifications won’t help much. The last question is how long these builds take. If you’re spending 20 minutes building a checked-in component, running tests and analysis, and creating installer packages, and other components depend on it, are those rebuilt automatically too, even though they haven’t changed?
Monitor the build system to see what it is actually doing. I’ve found it is easy to end up with a build server that does far more building than is necessary. That would be the first step I’d take.
6
In my team, our CI configuration enables us to trigger a job each time changes are detected in the SCM; this is what continuous means to us.
If you have the resources, consider implementing a distributed CI system to offload the workload to multiple servers and keep resource usage to a minimum on the master.
We have multiple jobs, each doing one specific thing. Consequently, we do not need to trigger all of the jobs every time: depending on where the changes were made, we trigger one or more jobs as needed.
Because each job is focused on a specific, specialized task, it generally completes in a reasonable time.
As already noted, the best-practice solution is to configure svn with a post-commit hook so it notifies Hudson when changes have been committed. This reduces load on the SCM and still facilitates CI.
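A post-commit hook along these lines would do it. This is a minimal sketch: the CI endpoint URL and repository path are placeholders (Jenkins’ Subversion plugin exposes a `notifyCommit` URL for this purpose, and other build servers offer similar hooks).

```shell
#!/bin/sh
# svn invokes post-commit hooks with the repository path and the new
# revision number. The defaults below are only so the sketch runs standalone.
REPOS="${1:-/var/svn/myrepo}"
REV="${2:-1234}"

# Placeholder endpoint: point this at whatever "a commit happened" URL
# your build server exposes.
NOTIFY_URL="http://ci-server:8080/notifyCommit?rev=${REV}"

echo "commit r${REV} in ${REPOS}; notifying ${NOTIFY_URL}"
# In the real hook, fire the request instead of just echoing it, e.g.:
# curl -s -o /dev/null "${NOTIFY_URL}"
```

With this in place the build server does no polling at all; it only wakes up when the SCM tells it something changed.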
1
The first thing that seems odd is that you’re polling for changes rather than receiving change notifications. I’ve never used CruiseControl.NET, but the other continuous integration tools I’ve used supported receiving a notification or trigger when the integration branch was updated. This is more efficient, since you won’t be wasting cycles checking for updates that don’t exist (I’d suspect it would also reduce power consumption). If, after this, your build/test cycle in your CI environment is still too slow or queues of work are building up, then I’d consider moving to a multiple-server environment.
The next thing I would recommend is ensuring some level of confidence that what your CI tool builds will always build. Depending on the size and complexity of the system, it might not make sense to build and then run every test in the development environment. However, prior to merging, the developer should at least build the system to ensure that the tests can be run. It might also be a good idea to have smoke tests that hit key areas of the system but can be run quickly in a development environment – if they don’t pass, the developer doesn’t merge changes. Running a more extensive set of unit tests over only the changed components is also an option. Leave the exhaustive testing to continuous integration, while maintaining sufficient confidence that you’ll be left with a good build after the merge.
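As a sketch of that pre-merge smoke check: the build and test commands below are placeholders for whatever your project actually uses (e.g. msbuild for the solution and an NUnit category of fast tests).

```shell
#!/bin/sh
# Developer-side gate: build, then run only the fast smoke tests.
# Fail fast on the first error so a broken build never gets merged.
set -e

# Placeholders so the sketch runs standalone; substitute real commands, e.g.
#   BUILD_CMD="msbuild MySolution.sln"
#   SMOKE_CMD="nunit-console MyTests.dll /include:Smoke"
BUILD_CMD="${BUILD_CMD:-echo [build step]}"
SMOKE_CMD="${SMOKE_CMD:-echo [smoke tests]}"

$BUILD_CMD
$SMOKE_CMD
echo "smoke check passed - safe to merge"
```

Because of `set -e`, the final message only prints if both the build and the smoke tests succeeded; anything else aborts before the merge.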
To me, the fact that one person being out of the office is a problem is itself part of the problem. The developer-side build and test cycles described above would partially reduce it by catching issues before continuous integration. But making sure that more than one person can fix issues discovered in any part of the system is also important. Consider code reviews, pair programming, or rotating people through different subsystems to spread knowledge around.
I think that very rarely is “throw more technology at the problem” an appropriate solution. Look at how you are using the technology first, along with the goals you want to achieve. It’ll probably end up being most cost-effective in the long run.