Traditionally we performed code review before commit, I had an argument with my colleague today, who preferred code review after commit.
First, here’s some background,
- We have some experienced developers and we also have new hires with almost zero programming experience.
- We’d like to perform fast and short iterations to release our product.
- All team members are located at the same site.
The advantages of code review before commit I’ve learned:
- Mentor new hires
- Try to prevent errors, failures, bad designs early in the development cycle
- Learn from others
- Knowledge backup if someone quits
But I’ve also had some bad experiences:
- Low efficiency, some changes may be reviewed over days
- Hard to balance speed and quality, especially for newbies
- One team member felt distrust
As to post-commit review, I know little about this, but the thing I’m most worried about is the risk of losing control due to lack of review.
Any opinions?
UPDATE:
- We’re using Perforce for VCS
- We code and commit in the same branches (trunk or bug fixing branches)
- To improve efficiency, we’ve tried to split code into small changes. We’ve also tried some live dialog review, but not everyone followed the rule. This is another problem though.
4
Like Simon Whitehead mentions in his comment, it depends on your branching strategy.
If the developers have their own private branch for development (which I’d recommend in most situations anyway), I’d perform the code review prior to merging with the trunk or main repository. This will allow developers to have the freedom to check in as frequently as they want during their development/testing cycle, but any time code goes into the branch that contains the delivered code, it gets reviewed.
Generally, your bad experiences with code reviews sound more like a problem with the review process that have solutions. By reviewing code in smaller, individual chunks, you can make sure it doesn’t take too long. A good number is that 150 lines of code can be reviewed in an hour, but the rate will be slower for people unfamiliar with the programming language, the system under development, or the criticality of the system (a safety critical requires more time) – this information might be useful to improve efficiency and decide who participates in code reviews.
10
There is a mantra that no one seems to have quoted yet: Check in early and often:
Developers who work for long periods — and by long I mean more than a day — without checking anything into source control are setting themselves up for some serious integration headaches down the line. Damon Poole concurs:
Developers often put off checking in. They put it off because they don’t want to affect other people too early and they don’t want to get blamed for breaking the build. But this leads to other problems such as losing work or not being able to go back to previous versions.
My rule of thumb is “check-in early and often”, but with the caveat that you have access to private versioning. If a check-in is immediately visible to other users, then you run the risk of introducing immature changes and/or breaking the build.
I’d much rather have small fragments checked in periodically than to go long periods with no idea whatsoever what my coworkers are writing. As far as I’m concerned, if the code isn’t checked into source control, it doesn’t exist. I suppose this is yet another form of Don’t Go Dark; the code is invisible until it exists in the repository in some form.
…If you learn to check in early and check in often, you’ll have ample time for feedback, integration, and review along the way…
I’ve worked for a couple companies that had different approaches towards this. One allowed it, as long as it didn’t prevent compiling. The other would freak out if you checked in any bugs at all. The former is much preferred. You ought to be developing in a kind of environment that would allow you to collaborate with other people on things that are still in progress, with the understanding that it will all be tested later.
There is also Jeff Atwood’s statement: Don’t be afraid to break things:
The most direct way to improve as a software developer is to be absolutely fearless when it comes to changing your code. Developers who are afraid of broken code are developers who will never mature into professionals.
I would also add that for peer reviews, if someone wants you to change something, having the history of your original version in source is a very valuable learning tool.
3
I’ve recently started doing pre-commit reviews in a project I’m in and I must say I’m pleasantly surprised about how unproblematic it is.
The biggest drawback of post-commit reviews that I see is that it’s often a single-person-only afair: Someone reviews the code for correctness, but the author is not usually involved unless there’s a serious mistake. Small problems, suggestions or hints don’t usually reach the author at all.
This also changes the perceived result of the code reviews: it’s seen as something that only produces additional work, as opposed to something where everyone (the reviewer and the author of the code) can learn new things every time.
1
Pre-commit code reviews seem so natural, it never occurred to me that reviews could deliberately be done after committing. From a continuous integration perspective, you want to commit your code once it is finished, not in a working-but-possibly-poorly-written state, right ?
Maybe it’s because the way we’ve always done it in my teams is live dialog initiated by the original developer, not asynchronous, control-driven, document-based reviews though.
3
Most repositories today support a two-phase commit or a shelveset (private branch, pull request, patch submission or whatever you want to call it), that will allow you to inspect/review work before pulling it into the main line. I would say that leveraging these tools would allow you to always do pre-commit reviews.
Also, you might consider pair coding (senior pairs with junior) as another way of providing a built-in code review. Consider it as a quality inspection on the assembly line instead of after the car has rolled off.
6
Do both :
- pre commit – do this kind of reviews when it is something very important, like a very reusable code piece, or major design decision
- post commit – do this kind of a reviews when you want to get opinion on a piece of code that might be improved
Any formal review should be undertaken on source files that are under configuration control, and the review records clearly stating the revision of the file.
This avoids any “you haven’t got the latest file” type arguments, and ensures everyone is reviewing the same copy of the source-code.
It also means that, should any post-review corrections be required, the history can be annotated with that fact.
For the code review itself, my vote is for ‘during’ the commit.
A system like gerrit or clover (I think) can stage a change and then have the reviewer commit it to the source control (push in git) if it’s good. That’s the best of both world.
If that’s not practical, I think that after commit is the best compromise. If the design is good then only the most junior developers should have things bad enough you don’t want them committed ever. (Make a pre-commit review for them).
Which leads to design review – while you can do it at code review time (or for that matter at customer deployment time), finding design issues should be done earlier than that – before the code is actually written.
With peer review there is a minimal risk of losing control. All the time two people have knowledge about the same code. They have to switch occasionally, so they have to be attentive all the time to keep track of the code.
It makes sense to have a skillful developer and a newbie working together. At first glance this seems to be inefficient, but it isn’t. In fact, there are fewer bugs, and it takes less time to fix those. Besides, the newbies will learn much faster.
What comes to preventing bad design, this should be done before coding. If there is any considerable change/improvement/new piece of design, it should be reviewed before coding starts. When design is completely developed, there is not much left to do. Reviewing the code will be easier and will take less time.
I agree that it is not essential to review code before committing only if the code is produced by an experienced developer, who have already proved their skills. But if there’s a newbie, code should be reviewed before committing: the reviewer should sit next to the developer and explain every change or improvement made by them.
Reviews benefit from both pre- and post- commits.
Pre-review commit
- Gives reviewers confidence they are reviewing the author’s latest revision.
- Helps ensure everyone reviews the same code.
- Gives a reference for comparison once revisions made from review items are complete.
No Running Commits During the Review
I have used Atlassian tools and have seen running commits occur during the review. This is confusing to reviewers, so I recommend against it.
Post Review Revisions
After reviewers complete their feedback, verbally or in writing, the moderator should ensure the author makes the requested changes. Sometimes reviewers or the author may disagree as to whether to designate a review item as a fault, suggestion, or an investigation. To resolve disagreements and ensure investigation items are cleared correctly, the review team depends on moderator judgment.
My experience with around 100 code inspections is that when reviewers can reference an unambiguous coding standard, and for most kinds of logic and other programming faults, review results are generally clear cut. Occasionally there is a debate about nit-picking or a point of style can degenerate to argument. However, giving decision power to the moderator avoids stalemate.
Post-Review Commit
- Gives the moderator and other reviewers a data point to compare against the pre-review commit.
- Provides metrics for to judge both the value and success of the review at defect removal and code improvement.
Me and my colleagues did some scientific research on this topic recently, so I’d like to add some of our insights despite this being a rather old question. We built a simulation model of an agile Kanban development process/team and compared pre commit and post commit review for a large number of different situations (different number of team members, different skill levels, …). We looked at the results after 3 years of (simulated) development time, and looked for differences with regard to efficiency (finished story points), quality (bugs found by customers) and cycle time (time from start to delivery of a user story). Our findings are as follows:
- The differences in terms of efficiency and quality are negligible in many cases. When they are not, post commit review has some benefits with regard to quality (other developers act as “beta-testers” in a way). For efficiency, post commit has some benefits in small teams and pre commit has some benefits in large or unskilled teams.
- Pre commit review can lead to longer cycle times, when you have a situation in which the start of dependent tasks is delayed by the review.
From these, we derived the following heuristic rules:
- When you have an established code review process, don’t bother changing, its probably not worth the effort
- unless you have problems with cycle time => Switch to Post
- Or slipped issues disrupt your developers too often => Switch to Pre
-
If you aren’t doing reviews yet
- Use pre commit if one of these benefits is applicable for you
- Pre commit review allows outsiders without commit rights to contribute to open source projects
- If tool-based, pre commit review enforces some review discipline in teams with otherwise lax process adherence
- Pre commit review easily keeps unreviewed changes from being delivered, which is neat for continuous deployment / very short release cycles
- Use pre commit if your team is large and you can live with or
circumvent the problems in cycle time - Otherwise (e.g. small, skilled industrial team) use post commit
- Use pre commit if one of these benefits is applicable for you
- Look for combinations that give you the benefits of both worlds (we haven’t studied these formally)
The full research paper is available here: http://dx.doi.org/10.1145/2904354.2904362 or on my website: http://tobias-baum.de
1
It depends on your team make up. For a relatively experienced team that is good about small, frequent commits then post-commit review just to get a second pair of eyes on the code is fine. For larger, more complex commits and/or for less experienced developers then pre-commit reviews to fix problems before they get in seems more prudent.
Along those lines, having a good CI process and/or gated check-ins lessens the need for reviews before commit (and arguably post commit for many of them).
In my opinion, code peer review works best if done post-commit.
I would recommend adjusting your branching strategy. Using a developer branch or feature branch has a number of benefits… not the least of which is facilitating post-commit code reviews.
A tool like Crucible will smooth and automate the review process. You can select one or more committed changesets to include in the review. Crucible it will show which files were touched in the selected changesets, keep track of which files each reviewer has already read (showing a % complete overall) and let the reviewers easily make comments.
http://www.atlassian.com/software/crucible/overview
Some other benefits of user/feature branches:
- Developers get the benefits of version control (back up changes, restore from history, diff changes) with less worry about breaking the system for everyone else.
- Changes which cause defects or don’t get finished on time can be backed out, re-prioritized, or shelved if necessary.
For inexperienced developers, regular consultation with a mentor and/or pair programming is a good idea, but I wouldn’t consider this a “code review”.
1
Both. (Kind of.)
You should be reviewing your own code summarily before committing it. In Git, I think the staging area is great. After I’ve staged my changes, I run git diff --cached
to see everything that is staged. I use this as an opportunity to make sure I’m not checking in any files that don’t belong (build artifacts, logs, etc.) and making sure that I don’t have any debug code in there or any important code commented out. (If I’m doing something that I know I don’t want to check in, I usually leave a comment in all caps so that I’ll recognize it during staging.)
Having said that, your peer code review should be generally be conducted after you commit, assuming that you are working on a topic branch. This is the easiest way to make sure that everybody else is reviewing the correct thing, and if there are major problems, then it’s no big deal to fix them on your branch or delete them and start over. If you conduct code reviews asynchronously (i.e. using Google Code or Atlassian Crucible), then you can easily switch branches and work on something else without have to separately keep track of all your different patches/diffs that are currently under review for a few days.
If you’re not working on a topic branch, you should. It reduces stress and hassle and makes release planning much less stressful and complicated.
Edit: I should also add that you should be doing code review after testing, which is another argument in favor of committing code first. You don’t want your test group fiddling around with dozens of patches/diffs from all the programmers, then filing bugs just because they applied the wrong patch at the wrong place.
100% paired programming (no matter how senior you think you are) with lots of small commits and a CI system that builds on EVERY commit (with automated testing including units, integration and functional wherever possible). Post-commit reviews for large or risky changes. If you must have some kind of gated/pre-commit review, Gerrit works.
The advantage of code review on check in (buddy check) is immediate feedback, before large pieces of code have been completed.
The disadvantage of code review on check in is that it can discourage people from checking in until long stretches of code have been completed. If this happens, it completely negates the advantage.
What makes this more difficult is that not every developer is the same. Simple solutions do not work for all programmers. Simple solutions are:
-
Mandated pair programming, which allows to do frequent check ins because the buddy is right next to you. This ignores that pair programming doesn’t always work for everybody. Done properly, pair programming can also be really tiring so it’s not necessarily something to do all day.
-
Developer branches, code is only reviewed and checked in main branch when finished. Some developers are prone to work in secrecy, and after a week they come up with some code which may or may not pass review because of fundamental problems that could have been spotted earlier.
-
Review on each check-in, which guarantees frequent reviews. Some developers are forgetful and rely on very frequent check ins, which means others have to do code reviews every 15 minutes.
-
Review at some unspecified time after check-in. Reviews will get pushed out further when there’s a deadline crunch. Code that depends on already committed but not yet reviewed code will be committed. Reviews will flag issues and the issues will be put on the backlog to be fixed “later”. Ok, I lied: This isn’t a simple solution, it isn’t a solution at all. Review at some specified time after check-in works, but is less simple because you have to decide what that specified time is
In practice, you make this work by making it even simpler, and more complex at the same time. You set simple guidelines, and let each development team figure out as a team what they need to do to follow these guidelines. An example of such guidelines is:
- Work is broken down in tasks that should take less than a day.
- A task is not finished if the code (if any) has not been checked in.
- A task is not finished if the code (if any) has not been reviewed.
Many alternative forms of such guidelines are possible. Focus on what you actually want (peer reviewed code, observable work progress, accountability) and let the team figure out how they can give you what they want.
Commit where? There’s a branch that I created to do some work. I commit to that branch whenever I feel like it. It’s nobody’s business. Then at some point that branch is integrated into a development branch. And somewhere in between is a code review.
The reviewer reviews obviously after I committed to my branch. He’s not sitting at my desk, so he cannot review it before I commit to my branch. And he reviews before the merge and commit to the development branch.
we actually do a hybrid on LedgerSMB. Committers commit changes that get reviewed after. Non-committers submit changes to committers to be reviewed before. This tends to mean two tiers of review. First you get a mentor to review and mentor you. Then that mentor gets the code reviewed a second time after he or she’s signed off on it and feedback circulates. New committers usually spend a lot of time reviewing other people’s commits at first.
It works pretty well. The thing though is a review after is usually more cursory than a review before, so you want to make sure that review after is reserved for those who have proven themselves. But if you have a two tier review for the new folks it does mean that problems are more likely to be caught and discussions had.
Just don’t do code reviews at all. Either you believe your developers are capable of writing good code, or you should get rid of them. Errors in logic should be caught by your automated tests. Errors is style should be caught by lint and static analysis tools.
Having humans involved in what should be automatic processes is just wasteful.
1