Often I like to keep a local copy of a large git repository which has the commit history but doesn’t actually download blobs until they’re needed, that is, a blobless clone. From git-clone(1): “--filter=blob:none will filter out all blobs (file contents) until needed by Git.” Compare the size of a complete linux clone to a blobless one:
$ git clone https://github.com/torvalds/linux.git
$ du -s linux/.git
5524724 linux/.git
$ git clone --filter=blob:none https://github.com/torvalds/linux.git linux-blobless
$ du -s linux-blobless/.git
2040204 linux-blobless/.git
Now, suppose that I do some operation which downloads other blobs, such as checking out an older tag:
$ git -C linux-blobless checkout v5.11
$ du -s linux-blobless/.git
2217700 linux-blobless/.git
If I then go back to the main branch (git -C linux-blobless checkout master), the new data sticks around. Is there a way to clean it up, that is, go back to having just the necessary data from the original blobless clone? Put another way, how can I remove all blobs but the ones necessary for the current HEAD?
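For a finer-grained view than du, something like the following could show whether the extra data arrived as additional pack files. This is only a sketch: repo_stats is a hypothetical helper name I made up for illustration, and it relies on git count-objects -v, which reports size-pack in KiB.

```shell
# Hypothetical helper (not part of git): summarize object storage for a repository.
# Usage: repo_stats [path-to-repo]   (defaults to the current directory)
repo_stats() {
    repo=${1:-.}
    # count     = number of loose objects
    # packs     = number of pack files
    # size-pack = total size of all packs, in KiB
    git -C "$repo" count-objects -v | grep -E '^(count|packs|size-pack):'
}
```

Running repo_stats linux-blobless before and after the checkout of v5.11 should make it possible to compare the pack count and size-pack values; I have not recorded those numbers here.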