For my oldest projects, I had source snapshots and releases from before they were first imported into a version control system. With Git, it's possible to insert these into the history of the project.
I did this by creating a new branch that's not attached to the existing history, and then pasting the existing history onto the end of it. Since this requires rewriting the existing history, it's best to do this as part of a conversion, rather than messing up an existing published repository (grafting would be a better option in that case).
First, start a new history branch within the repository by creating an orphan commit, dated as early as possible:
    git checkout --orphan history
    git rm -rf .
    GIT_AUTHOR_DATE="1970-01-01T00:00:00 +0000" \
    GIT_COMMITTER_DATE="1970-01-01T00:00:00 +0000" \
    git commit --allow-empty -m 'Initial empty commit.'
(We could start with a commit that imports our first tarball instead, but there's no harm in having an empty commit, and it makes editing the history again later a bit easier.)
Next we need to extract each of our tarballs, remove anything we don't want added to Git, commit the rest, and create any release tags we want. Doing this by hand gets tedious very quickly, so I wrote git-import-snapshots:
git-import-snapshots $(ls -Str ~/snapshots/project*.gz)
You'll definitely want to edit the script if you use it yourself — it needs to understand how you've named your snapshots.
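As a rough illustration of what such a script has to do for each tarball, here is a minimal sketch. It is not the actual git-import-snapshots. The function names snapshot_version and import_snapshot, the project-VERSION.tar.gz naming scheme, the "v" prefix on tags, and the assumption that every tarball unpacks into a single top-level directory are all made up for the example. It also relies on GNU date (for "date -r FILE") and on a tar that understands --strip-components.

```shell
#!/bin/sh
# Hypothetical sketch of a snapshot importer; adjust the naming rules to
# match your own snapshots.

# Derive a version string from a snapshot path,
# e.g. ~/snapshots/project-1.2.tar.gz -> 1.2
snapshot_version() {
    name=${1##*/}                # drop the directory part
    name=${name%.tar.gz}         # drop the extension
    printf '%s\n' "${name#*-}"   # drop everything up to the first "-"
}

# Replace the tracked tree with the contents of one tarball, commit it with
# the tarball's modification time, and tag the result.
import_snapshot() {
    tarball=$1
    version=$(snapshot_version "$tarball")
    # Assumption: GNU date, where -r FILE prints the file's mtime.
    stamp=$(date -r "$tarball" '+%Y-%m-%dT%H:%M:%S %z')

    # Remove every tracked file so deletions between snapshots are recorded.
    git rm -rfq --ignore-unmatch .
    # Assumption: the tarball has exactly one top-level directory to strip.
    tar xzf "$tarball" --strip-components=1
    git add -A
    GIT_AUTHOR_DATE="$stamp" GIT_COMMITTER_DATE="$stamp" \
        git commit -q --allow-empty -m "Import snapshot $version."
    # An annotated tag takes its date from GIT_COMMITTER_DATE.
    GIT_COMMITTER_DATE="$stamp" git tag -a -m "Release $version." "v$version"
}
```

With the history branch checked out, calling import_snapshot once per tarball, oldest first, would build up the series of commits. Anything you don't want in Git would need to be deleted between the tar and git add steps.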
In most cases I was able to use the modification date of the snapshot file to identify an appropriate commit date. In a few cases I couldn't do that; instead, I needed to look at the contents of the tarball and find the latest modification date of the files inside it:
tar tzvf snapshot.tar.gz | grep -v '/$' | sort -k 4
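If you'd rather have just the newest timestamp than a sorted listing to read through, a small filter along these lines might do. This is a sketch that assumes GNU tar's verbose format, where the fourth and fifth fields are the date and time as YYYY-MM-DD HH:MM (other tar implementations print the listing differently), and that no file name ends in a slash. The name latest_mtime is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: read a GNU "tar tzvf" listing on stdin and print the
# most recent modification time among the non-directory entries.
latest_mtime() {
    grep -v '/$' |                # skip directories, which end in "/"
        awk '{ print $4, $5 }' |  # keep only the date and time fields
        sort |                    # ISO-style dates sort chronologically as text
        tail -n 1                 # the last line is the newest
}
```

It would be used as "tar tzvf snapshot.tar.gz | latest_mtime", and the result can then be passed on as the commit date.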
Viewing the branch with gitk history should now show an appropriate series of commits and tags. If not, tweak the script's rules and do it again.
Now we need to join the two histories together, by rewriting the master branch's first commit so that it follows the latest commit on the history branch:
    ref=$(git rev-parse history)
    git filter-branch -f \
        --parent-filter 'sed "s/^\$/-p '$ref'/"' \
        --tag-name-filter cat \
        master
While we're only really changing the first commit, all the subsequent ones will need to be rewritten too, in order to catch up with the hash changes. The --tag-name-filter cat option is required in order to preserve tags (i.e. rewrite them to point at the rewritten commits).
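The parent filter is easier to follow in isolation. For each commit, git filter-branch feeds the filter the commit's parent list on standard input, as "-p <hash>" entries, and uses whatever the filter prints as the new list. A root commit has no parents, so its input is an empty line, and that is the only line the sed expression matches. The sketch below wraps the same sed expression in a function (rewrite_parents is a name made up for the example, and the hashes are placeholders) so it can be tried without touching a repository.

```shell
#!/bin/sh
# Same substitution as the --parent-filter above, with the new parent passed
# as $1: an empty parent list becomes "-p <new parent>", and any non-empty
# list passes through unchanged.
rewrite_parents() {
    sed "s/^\$/-p $1/"
}

# Root commit: empty input gains the new parent.
printf '\n' | rewrite_parents abc123

# Ordinary commit: existing parents are left alone.
printf '%s\n' '-p def456' | rewrite_parents abc123
```

Only the first call changes anything, which is why the rewrite reparents just the original first commit of master.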
We can now check out the rewritten master branch, and throw away the history branch:
    git checkout master
    git branch -d history