I just refactored my project’s entire codebase. So much so that even though it uses most of the same code, things work in a radically different way. If the old version was 1.0, the new one would be 2.0. The project itself is just under 1 MB in size (it’s a tiny little lib). I started the project a long time ago and it has undergone many changes… so many that my .git folder is now over 3 MB in size.
In this case 3 MB is a very small amount of data, but from a ‘big picture’ perspective, when should you cut your previous VCS history out of your current project and start over? Or should you never ever do this?
7
Storage is cheap, just keep it.
Sometimes you have to check if some bug was present in the pre-refactor version or reference some old piece of code.
If you really are concerned about the git checkout size then sure, you can go and delete the history, but in that case I’d suggest starting a new repo and leaving the refactor commit in the old one, so that when you really need the history you can easily connect the two.
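One way to "connect" the two repos later is sketched below. This is an assumption about what reconnecting could look like, not something the answer spells out: it fetches the old repo into the new one and uses git replace --graft to make the new repo's root commit appear to descend from the old repo's last commit. The sandbox paths, the remote name "old", and the commit helper are hypothetical.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
old="$work/old"
new="$work/new"

# Commit helper that does not depend on a global git identity.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

# The old repo keeps the pre-refactor history, ending at the refactor commit.
git init -q "$old"
commit "$old" "1.x work"
commit "$old" "refactor to 2.0"

# The new repo starts over with a single root commit.
git init -q "$new"
commit "$new" "2.0 initial import"

# Bring the old objects into the new repo when they are needed.
git -C "$new" remote add old "file://$old"
git -C "$new" fetch -q old

new_root=$(git -C "$new" rev-list --max-parents=0 HEAD)
old_tip=$(git -C "$old" rev-parse HEAD)

before=$(git -C "$new" rev-list --count HEAD)
# Locally pretend the new root has the old tip as its parent.
git -C "$new" replace --graft "$new_root" "$old_tip"
after=$(git -C "$new" rev-list --count HEAD)

echo "commits visible before graft: $before, after graft: $after"
```

The graft lives in refs/replace inside the local repo only, so the new repo's real commits are not rewritten and the link can be dropped again with git replace -d.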
A git-specific note: if you don’t want the history in your current checkout (because you just want to compile the project or create patches for a current problem), you can clone with --depth=1, and git won’t fetch any history beyond the latest commit. Second thing: you can create an --orphan branch that won’t share any history with the other branches in the repo; you can then clone only this branch and its history, omitting all other objects in the remote git repo.
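Both options can be tried in a scratch directory. The sketch below is only an illustration under assumed names (the sandbox paths and the branch name v2 are made up): it builds a small repo, makes a shallow clone, then creates an orphan branch and clones just that branch with --single-branch.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; all paths and branch names here are hypothetical.
work=$(mktemp -d)
src="$work/src"
git init -q "$src"
cd "$src"

# Commit helper that does not depend on a global git identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}

# Three commits stand in for the long pre-refactor history.
commit "old 1"
commit "old 2"
commit "old 3"
full=$(git rev-list --count HEAD)

# Shallow clone: only the newest commit is fetched.
# file:// is used because --depth is ignored for plain local paths.
git clone -q --depth=1 "file://$src" "$work/shallow"
shallow=$(git -C "$work/shallow" rev-list --count HEAD)

# Orphan branch: its first commit has no parent, so it shares no history.
git checkout -q --orphan v2
commit "2.0 fresh start"

# Clone only the orphan branch; the old commits are not transferred.
git clone -q --single-branch --branch v2 "file://$src" "$work/v2only"
orphan=$(git -C "$work/v2only" rev-list --count --all)

echo "full=$full shallow=$shallow orphan=$orphan"
```

The shallow clone still points at the old history on the server and can be deepened later with git fetch --unshallow, whereas the single-branch clone of the orphan branch never sees the old commits unless the other branches are fetched explicitly.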
2
My “rule of thumb” about the lifetime of version history in any kind of VCS: keep the version history alive for as long as you have to maintain that specific software.
For example, I work on a product that is more than 10 years old and is constantly maintained and evolved. Our Subversion repo for that product is about 500 MB, but that is really “peanuts”. Having the complete log available over the whole product lifetime has often been very useful for understanding changes that were made long ago.
I would suggest keeping the entire history of git commits, since it serves as a log of both major and minor changes. However, once a major update is released, I would suggest forking the code into a separate repository, so that the older version receives only minor updates and bug fixes while development focuses on the new major version.
This helps maintain a good development structure and also slows the rate at which each repository grows. Many open source projects follow this kind of major-revision management structure.
1
I’m a Mercurial guy, I also use a little git.
I can tell you that in Mercurial editing or removing history is discouraged, but not impossible. Everybody lives happily with this way of doing things, and I think it is the better way to go. Why edit the history unless you screwed up?
Git enables you to edit history, but that doesn’t mean you should. It creates problems for other people, and nothing really good comes out of it except a tidier history, which you shouldn’t care about as long as you keep your commits reviewable. My perception is that with Git, people can go OCD with the history-editing power.
If you absolutely have to edit history, however, never-ever-ever-ever modify it after pushing, or you will make people mad.