Rewriting history - Git history, that is
Remove unwanted commits from your Git history
I recently co-created a PhD course on reproducibility for quantitative data science with Dr Melanie Ganz. We also wanted to make the content Open, not just for students but also for other researchers and teachers. One thing we do is get students familiar with Git and GitHub. Since this was our first attempt at the course, I let people push and test stuff in the main repository. Because I also had help from guest teachers, I wanted to keep their commits while deleting the students' commits (next time I'll use a branch). Time to clean.
Note: All the documentation exists online, but there are always extra steps that are not part of the command documentation. The idea behind this blog post is to chain all the commands you need.
Use the doc and tools
A quick search led me to find (i) the git documentation 'Rewriting History', (ii) a Git history editor, which makes it straightforward to edit the history, though I wanted to delete stuff, and (iii) various discussions on how to use rebase.
Edit your own commits with git rebase
This is the command where the magic happens.
step 1: git branch backup
well, let's make sure not to screw everything up and make a copy, just in case (we'll delete it later once we know the job is done - sure, there is the git reflog command, but I don't like it)
step 2: check how far back you want to go (which commit)
git log --pretty=format:"%h - %an, %ar : %s"
it can also be useful to look for specific users and then their commits
git shortlog -sn --all
git log --author="name"
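To try step 2 without touching a real repository, here is a minimal sketch in a throwaway repo. The author names, e-mail addresses and file name are made up for the example.

```shell
# Throwaway repository with two hypothetical authors
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Guest Teacher"
git config user.email "teacher@example.org"

echo "lesson" > notes.txt
git add notes.txt
git commit -q -m "Add lesson notes"

echo "test" >> notes.txt
git commit -q -a --author="Student One <student@example.org>" -m "Student test push"

# Number of commits per author, most active first
git shortlog -sn --all

# Short hashes of the commits made by one author only
student_commits=$(git log --author="Student One" --pretty=format:"%h")
echo "$student_commits"
```

The hashes printed at the end are the ones to look for in the rebase todo list later on.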
step 3: git rebase -i <commit ID>
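using the -i argument lets you do things interactively in your editor. The list that opens shows the commits made after <commit ID>, so point it at the parent of the oldest commit you want to change. Ah yes, you do need to tell git what editor to use (git config --global core.editor "'some command or link to exe, depends on your OS' -w"); for editors such as VS Code or Sublime Text, -w makes the command wait until you close the file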
step 4: delete
In the editor, simply replace pick with drop in front of each commit you want to remove. As the comments at the bottom of the list show, there are many other commands you can use.
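If you want to rehearse steps 3 and 4 first, the sketch below does the same thing in a throwaway repo. It is only an illustration: the file names are made up, and sed plays the role of the editor through the GIT_SEQUENCE_EDITOR environment variable so the example can run unattended. In real life you would change the word by hand.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Guest Teacher"
git config user.email "teacher@example.org"

# Three commits, each touching its own file so that dropping one cannot conflict
for name in lesson student-test exercise; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name"
done

git branch backup

# The todo list is oldest first: line 1 is "Add student-test", line 2 is "Add exercise".
# sed stands in for the editor and turns the first "pick" into "drop".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" git rebase -i HEAD~2

git log --pretty=format:"%h - %an, %ar : %s"
```

After the rebase, the log should list only the lesson and exercise commits, while the backup branch still has all three.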
step 5: anonymize (change author names)
use the edit command in the editor when calling rebase (replace pick with edit in front of the commit)
git then stops at each commit marked this way; now, interactively in the command window, do the editing
git commit --amend --author="John Doe <john@doe.org>" --no-edit
git rebase --continue
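Here is the whole of step 5 as a sketch in a throwaway repo. Again the names and files are invented, and sed stands in for the editor (via GIT_SEQUENCE_EDITOR) only so that the example runs without a window popping up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Guest Teacher"
git config user.email "teacher@example.org"

echo "lesson" > lesson.txt
git add lesson.txt
git commit -q -m "Add lesson"

echo "answer" > answer.txt
git add answer.txt
git commit -q --author="Student One <student@example.org>" -m "Add answer"

echo "exercise" > exercise.txt
git add exercise.txt
git commit -q -m "Add exercise"

# Mark the first commit of the todo list ("Add answer") as "edit";
# the rebase stops right after applying it
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD~2

# Rewrite the author of the commit the rebase stopped on, keeping its message
git commit -q --amend --author="John Doe <john@doe.org>" --no-edit

# Replay the remaining commits
git rebase --continue

git log --pretty=format:"%h - %an : %s"
```

Note that this only changes the author field. The committer field of the rewritten commits records whoever ran the rebase.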
pro tip: better to do a few small rebases than one big one, to limit conflicts
Delete PR: squash and merge
Branch and cherry-pick
git cherry-pick commit2^..commit_latest
git push origin newbranch
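A minimal sketch of the branch-and-cherry-pick route, assuming the unwanted pull request boils down to a single commit sitting between commits I want to keep. Everything here is hypothetical: the file names, the local bare repository standing in for GitHub, and the variables commit2 and commit_latest, which mirror the placeholders in the command above.

```shell
repo=$(mktemp -d)
remote=$(mktemp -d)
git init -q --bare "$remote"   # local bare repository standing in for GitHub

cd "$repo"
git init -q
git config user.name "Guest Teacher"
git config user.email "teacher@example.org"
git remote add origin "$remote"

# Four commits; "student-pr" is the one to leave behind
for name in lesson student-pr exercise solution; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name"
done

commit2=$(git rev-parse HEAD~1)        # "Add exercise", first commit to carry over
commit_latest=$(git rev-parse HEAD)    # "Add solution", last commit to carry over

# Start the new branch just before the unwanted commit
git checkout -q -b newbranch HEAD~3

# A..B excludes A itself, hence the ^ to include commit2 in the range
git cherry-pick "${commit2}^..${commit_latest}"

git push -q origin newbranch
git log --pretty=format:"%h : %s"
```

The new branch ends up with the lesson, exercise and solution commits, and the student-pr commit never makes it over.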
Done
checking my clean commit history
git diff whatever_branchname backup
checking against my backup that I'm happy with the actual repo
git branch -D backup
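The final check can be sketched like this, once more in a throwaway repo with invented names. I use git reset here purely as a quick stand-in for the rebase cleanup, and --name-only to list which files differ rather than reading the full diff.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Guest Teacher"
git config user.email "teacher@example.org"

for name in lesson student-test; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "Add $name"
done

git branch backup

# Stand-in for the rebase cleanup: remove the last commit from the current branch
git reset -q --hard HEAD~1

# Files that differ between the cleaned branch and the backup
changed=$(git diff --name-only HEAD backup)
echo "$changed"

# Only the file I meant to remove differs, so the backup can go
if [ "$changed" = "student-test.txt" ]; then
    git branch -D backup
fi
```

If the list contains anything you did not mean to remove, keep the backup and go back over the rebase.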
et voila :-)