I’m a teacher and I’m about to publish a tutorial on how to build a PHP app in several iterative steps. At each iteration, I want to provide access to the associated source code and the differences from the code of the previous iteration.
Naturally, I plan on using a Git repository for the project source code with a branch for each iteration. However, I wonder about future fixes and updates to my different branches. If I make a fix on a branch (say, iteration-03), I’d like it to “propagate” to all upcoming branches (iteration-04, iteration-05, and so on). The diffs between my branches must also stay coherent.
So my questions are:
- What is the best way to structure a Git repo into branches to achieve my goal?
- How do I propagate fixes made on one branch into the following ones?
Thanks in advance for your answers.
Regards,
Baptiste
Git is not a very suitable tool for this. In particular, git was created to keep track of the exact change history of data. While the branching system offers some support to keep different forks up to date, this is not very convenient. Let’s assume you have the following history graph:
a1--a2                branch a
        b1            branch b
            c1--c2    branch c
You now want to fix an error on branch a. We do this and commit the change as a3:
a1--a2--a3            branch a
        b1            branch b
            c1--c2    branch c
To get the change to the other branches, we can either rebase each branch onto its predecessor, which will replay the commits of each branch being rebased:
a1--a2--a3                  branch a
            b1'             branch b
                c1'--c2'    branch c
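The rebase variant could be scripted roughly as follows. This is a minimal sketch, not a definitive recipe: the throwaway repository, the file names (a.txt, b.txt, c.txt), the `commit` helper and the placeholder identity are all assumptions made for illustration. It records where b pointed before the rebase so that only the commits unique to c are replayed onto the rewritten b.

```shell
#!/bin/sh
set -e
# Placeholder identity (assumption) so commits work on an unconfigured machine
export GIT_AUTHOR_NAME="Tutorial Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Work in a throwaway repository
repo=$(mktemp -d) && cd "$repo"
git init -q

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

# Build the chain a -> b -> c from the graphs above
git checkout -q -b a; commit a.txt a1; commit a.txt a2
git checkout -q -b b; commit b.txt b1
git checkout -q -b c; commit c.txt c1; commit c.txt c2

# The fix a3 on branch a
git checkout -q a; commit a.txt a3

# Remember the old tip of b, then rebase b onto a
old_b=$(git rev-parse b)
git rebase -q a b

# Replay only the commits of c that came after the old b onto the new b
git rebase -q --onto b "$old_b" c

# One commit subject per line, newest first
git log --format=%s c
```

Every rebased branch gets new commit hashes, so anyone who already cloned the repository would have to reset their local branches afterwards.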
Or we could merge each branch into its successor (no fast-forwarding is possible, so we’ll get explicit merge commits b2 and c3):
a1--a2------a3              branch a
        b1------b2          branch b
            c1--c2--c3      branch c
While rebasing makes the history look “nicer”, the merge commits tell us the real history of the project. In either case, any change will modify all dependent branches, and the history will clearly show the fix commit a3. Git does not automate keeping the dependent branches up to date, but you could write a script to assist you. Note that any merge or rebase may be interrupted if the changes cannot be combined automatically, in which case you’ll have to resolve the conflicts manually.
Using this scheme, the diffs between the branches will stay fairly constant for changes in the base branch, since the change will be propagated to all dependent branches.
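Such a helper script for the merge variant might look like the sketch below. The repository layout, the file names, the `commit` helper and the placeholder identity are assumptions for illustration; the only essential part is the loop that merges each branch into its successor, in order.

```shell
#!/bin/sh
set -e
# Placeholder identity (assumption) so commits work on an unconfigured machine
export GIT_AUTHOR_NAME="Tutorial Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Work in a throwaway repository
repo=$(mktemp -d) && cd "$repo"
git init -q

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

# Build the chain a -> b -> c, then commit the fix a3 on branch a
git checkout -q -b a; commit a.txt a1; commit a.txt a2
git checkout -q -b b; commit b.txt b1
git checkout -q -b c; commit c.txt c1; commit c.txt c2
git checkout -q a; commit a.txt a3

# Propagate: merge every branch into its successor
prev=a
for branch in b c; do
    git checkout -q "$branch"
    # --no-edit accepts the default merge message; with set -e the
    # script stops here if the merge hits a conflict
    git merge -q --no-edit "$prev"
    prev=$branch
done

# The diff between neighbouring iterations only contains that iteration's own work
git diff --stat a b
git diff --stat b c
```

If a merge stops on a conflict, resolve it by hand, commit, and rerun the loop for the remaining branches.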
Alternative Strategy: Tags ✘
If you want to be able to fix previous “versions”, then simply recording the history and tagging interesting states is not a viable option. We might start with a history like this:
a1--a2--b1--c1--c2    branch master
    ↑   ↑       ↑
  tag a tag b tag c
We could branch off to create a fix branch, where we commit the fix as a3:
        a3            branch fix
      /
a1--a2--b1--c1--c2    branch master
    ↑   ↑       ↑
  tag a tag b tag c
We could then rebase the master branch onto fix:
        a3--b1'--c1'--c2'    branch master
      /
a1--a2--b1--c1--c2
    ↑   ↑       ↑
  tag a tag b tag c
Unfortunately, the tags refer to their original commits and were not updated.
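The effect can be reproduced in a scratch repository. The sketch below is illustrative only: the file names, the `commit` helper and the placeholder identity are assumptions. It rebuilds the tagged history, rebases master onto fix, and then checks which tags are still reachable from master.

```shell
#!/bin/sh
set -e
# Placeholder identity (assumption) so commits work on an unconfigured machine
export GIT_AUTHOR_NAME="Tutorial Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Work in a throwaway repository
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b master

# Hypothetical helper: append a line to a file and commit it
commit() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$2"; }

# Linear history with one tag per iteration
commit a.txt a1; commit a.txt a2; git tag a
commit b.txt b1; git tag b
commit c.txt c1; commit c.txt c2; git tag c

# Branch off at tag a, commit the fix, and rebase master onto it
git checkout -q -b fix a
commit a.txt a3
git rebase -q fix master

# Check which tags are still part of the rewritten master
for t in a b c; do
    if git merge-base --is-ancestor "$t" master; then
        echo "tag $t: still reachable from master"
    else
        echo "tag $t: left behind on the old commits"
    fi
done
```

Only tag a survives, because its commit lies before the fix; tags b and c would have to be moved by hand after every such rebase.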
Alternative Strategy: Multiple folders with git remote ✔
Then, there’s the possibility of simply storing each iteration in a separate directory. This is actually the most user-friendly option since it does not require git knowledge to access a specific state, but this makes it more difficult to properly track changes. One possibility to use git would be to have a separate git repository in each iteration folder. Each iteration would then mark the previous iteration’s repository as the upstream repository. After each change, you would git pull the changes in each dependent repo.
While this is the most difficult to set up, this is likely to be the best and most intuitive solution for all participants.
Here is some code to set up an example project using this strategy:
#!/bin/sh
# create the three iterations a, b, c:
mkdir a b c
# fill each repository and set up upstream repos
cd a
git init
git commit --allow-empty -m 'a1'
git commit --allow-empty -m 'a2'
cd ../b
# clone the previous iteration; its default branch is checked out
# automatically, whatever it is called (master, main, ...)
git clone ../a .
git commit --allow-empty -m 'b1'
cd ../c
git clone ../b .
git commit --allow-empty -m 'c1'
git commit --allow-empty -m 'c2'
cd ..
Now we create a commit a3:
cd a
git commit --allow-empty -m 'a3'
cd ..
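To update the other repos, we simply issue a git pull in each of them:
cd b
# --no-rebase asks for a merge; recent Git versions may refuse to pull
# diverged branches unless a strategy is given
git pull --no-rebase # enter message for merge commit
cd ../c
git pull --no-rebase # enter message for merge commit
cd ..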
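With more than a few iterations, the pulls could be driven by a loop. The following is a self-contained sketch under stated assumptions: it rebuilds the three example repositories in a temporary directory, uses a placeholder identity, and relies on the repository names sorting in dependency order.

```shell
#!/bin/sh
set -e
# Placeholder identity (assumption) so commits work on an unconfigured machine
export GIT_AUTHOR_NAME="Tutorial Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Work in a throwaway directory
work=$(mktemp -d) && cd "$work"

# Iteration a, then b cloned from a, then c cloned from b
git init -q a
git -C a commit -q --allow-empty -m 'a1'
git -C a commit -q --allow-empty -m 'a2'
git clone -q a b
git -C b commit -q --allow-empty -m 'b1'
git clone -q b c
git -C c commit -q --allow-empty -m 'c1'
git -C c commit -q --allow-empty -m 'c2'

# The fix in the first iteration
git -C a commit -q --allow-empty -m 'a3'

# Pull in dependency order: b must be updated before c
for repo in b c; do
    # --no-rebase forces a merge, --no-edit accepts the default message
    git -C "$repo" pull -q --no-rebase --no-edit
done

# Show the resulting history of the last iteration
git -C c log --graph --oneline
```

The order of the loop matters: pulling in c before b would leave c without the fix until the next run.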
After I ran those commands, the resulting log in c looked like this:
*   d8943fb c3
|\
| *   c670d4f b2
| |\
| | * f2ae31e a3
* | | e362453 c2
* | | 6e2a2d7 c1
|/ /
* | c78583c b1
|/
* 3c377e4 a2
* 098f321 a1
This strategy ends up being functionally equivalent to propagating the changes via merging, but might be a bit easier to use, especially for newbies working through your tutorial if they downloaded the code as a zip archive. On the other hand, this makes it more difficult to distribute the complete tutorial as a git repo, since everything is spread across multiple linked repos. If that matters to you, the ordinary merging approach described at the top is likely the better choice.