When You “Git” in Trouble: a Version Control Story

Written by sjsyrek | Published 2017/09/24
Tech Story Tags: git | programming | software-development | code | github


Thank you for laughing at my extremely funny title. But do you know what’s not funny? When you push a commit to your git repo, and you see this in GitHub Desktop:

Yes, I know that the cool people use Git Tower and that the really cool people just use the command line. We’re really cool people, so we’re going to use the command line to solve this problem. In fact, we have no choice — and that’s the adventure you’re joining me on in this article: fixing a git repo that suddenly becomes damaged through absolutely no fault of your own and despite having no command line git expertise whatsoever. But at least you have a visualization of my panic.

Step one is diagnosing the problem. If you’re like me, however, you also have a meta problem, and that comes from relying on a tool you barely understand, which makes diagnosis difficult. Finding sympathetic experts to help you resolve your specific case is also a challenge. You might not even know who to ask or what the questions should be. Nevertheless, you can learn how things work and how to resolve your own dilemmas piecemeal—if you’re patient, systematic, and willing to learn. And that, friends, entailed for me a trip to the git reference documentation, where I discovered the git-fsck command, which I dutifully ran in my repo’s root directory, and which resulted in the following (truncated) output:

> git fsck
...
error: object file .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f is empty
error: object file .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f is empty
fatal: loose object 6799ddac675cab54060cdfb066dbfadb6708fc3f (stored in .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f) is corrupt

So what we have here, and what you probably have if one of your projects comes down with a case of the blinkies, is a corrupted repository. Oh. Good to know. Now what? When you want to know the meaning of life, you ask God. In this case, I consulted an ancient email from Linus Torvalds, which happens to address a similar situation.

A git repo is really a graph of various kinds of binary objects: blobs, trees, and commits. Blob objects are cryptographically-hashed, well, blobs of your data, each representing one of your files. These blob objects are independent of one another, but they are also linked by so-called tree objects, which effectively group the blobs into an arrangement that’s analogous to a file system’s directory structure. Finally, there are commit objects, which contain the information necessary to track the changes in your trees and blobs. The commit objects are also linked, sequentially (as you might expect).

Git stores all of these objects in a series of nested directories, located in .git/objects/, according to the first two characters of their object IDs (as you can see above). For example, the object 6799ddac675cab54060cdfb066dbfadb6708fc3f is stored in a directory called 67/ as the file 99ddac675cab54060cdfb066dbfadb6708fc3f; that is, the full object name is a combination of the directory it's stored in and a specific file in that directory.
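The mapping from object name to file is simple enough to sketch in a few lines of Python. This is only an illustration: the function name loose_object_path is my own invention, and it assumes a 40-character SHA-1 ID and an object that is stored "loose" (as an individual file) rather than inside a packfile.

```python
from pathlib import PurePosixPath


def loose_object_path(object_id: str, git_dir: str = ".git") -> PurePosixPath:
    """Return the path where a loose object with this ID would be stored.

    Hypothetical helper for illustration; assumes a 40-character SHA-1 ID.
    """
    if len(object_id) != 40 or any(c not in "0123456789abcdef" for c in object_id):
        raise ValueError("expected a 40-character lowercase hex object ID")
    # The first two characters name the directory, the remaining 38 name the file.
    return PurePosixPath(git_dir) / "objects" / object_id[:2] / object_id[2:]


print(loose_object_path("6799ddac675cab54060cdfb066dbfadb6708fc3f"))
# .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f
```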

So, if one of the commit objects becomes corrupted, your whole repo may turn into a useless pile of bytes, because the chain of linked commits will have been broken. That’s the bad news. The good news is that for the very reason your repo is a collection of discrete files, you may be able to restore it to health, even if one of the objects is corrupted beyond repair — if you can perform a precise enough surgery.
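To see why one bad commit can stall everything, here is a toy model in Python. It is not how git is implemented: the dictionary, the walk_history function, and the use of abbreviated IDs are all assumptions made for illustration, and each commit gets a single parent for simplicity.

```python
def walk_history(tip, commits):
    """Follow parent links from tip.

    Returns (visited, missing): the commits reached in order, and the ID
    that could not be found (or None if the walk reached the root).
    """
    visited = []
    current = tip
    while current is not None:
        if current not in commits:
            # The chain is broken here; nothing older can be reached.
            return visited, current
        visited.append(current)
        current = commits[current]
    return visited, None


# Toy history: each commit maps to its parent. The entry for 6799dda is
# deliberately absent, standing in for the corrupted object. History
# before 44dc22e is omitted from the model.
commits = {"b267a6a": "6799dda", "44dc22e": None}

print(walk_history("b267a6a", commits))
# (['b267a6a'], '6799dda')
```

Everything older than the missing commit still exists on disk in this model, but a walk that starts at the branch tip never gets to it.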

That’s what I tried to do.

Following Linus’s advice, I moved the corrupted commit object file .git/objects/67/99ddac675cab54060cdfb066dbfadb6708fc3f somewhere else. You can park your damaged objects wherever you like. They will probably end up in the trash, anyway.
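I did the moving by hand, but the same step could be scripted. The following is a minimal sketch, assuming that the only damage is zero-length loose object files like the ones git fsck complained about; the function name quarantine_empty_objects and the quarantine directory are hypothetical. Back up the whole repo before trying anything like this.

```python
import shutil
from pathlib import Path


def quarantine_empty_objects(git_dir, quarantine_dir):
    """Move zero-length loose object files out of git_dir/objects.

    Returns the list of object IDs that were moved. Hypothetical helper:
    it only detects empty files, not other kinds of corruption.
    """
    git_dir = Path(git_dir)
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for fan_out in sorted((git_dir / "objects").iterdir()):
        # Loose objects live in two-character directories; skip "pack" and "info".
        if not fan_out.is_dir() or len(fan_out.name) != 2:
            continue
        for obj in sorted(fan_out.iterdir()):
            if obj.is_file() and obj.stat().st_size == 0:
                object_id = fan_out.name + obj.name
                # Park the file under its full ID so it is easy to identify later.
                shutil.move(str(obj), str(quarantine_dir / object_id))
                moved.append(object_id)
    return moved
```

Calling quarantine_empty_objects(".git", "/tmp/broken-objects") from the repo root would return the IDs it moved, which are the same IDs to look for in the next git fsck run.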

I happened to get the same error message for blob object 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf, so I moved that one too. Then I tried the file system check again:

> git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (8970/8970), done.
broken link from    tree 03a88f876eb3f6157f76461a3ae6cb18bbb86561
              to    blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
dangling commit 76814e15074b540bc2f7e78daf3f5175a8759523
missing commit 6799ddac675cab54060cdfb066dbfadb6708fc3f
missing blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
dangling blob 2a60520000698ad964e4e61fab31f9b862763550
dangling commit 41634cd81964068acb153bfa355d63bd80fc7cef
dangling commit 5bf415e2bdbc47822ae99b64c2a0f6b4f288eefb

Note that Linus suggests using git fsck --full, but this is the default behavior now.

Ignoring the “dangling commit” messages, the “broken link” message tells me which tree object points to the blob object I just removed. In effect, I broke the link on purpose to reveal this information. Tree object 03a88f876eb3f6157f76461a3ae6cb18bbb86561 expects to point to blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf, but the blob isn't there. Commit object 6799ddac675cab54060cdfb066dbfadb6708fc3f, the other one I moved, is also reported missing. So far, so good.
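With a bigger repo, the fsck report can get long, and sorting it by eye gets tedious. Here is a rough Python sketch that groups the three kinds of message seen above. The parse_fsck name and the regular expressions are my own and are based only on the output format shown in this article, not on any documented guarantee about what git fsck prints.

```python
import re

HEX = r"([0-9a-f]{40})"


def parse_fsck(output):
    """Group fsck lines into missing objects, dangling objects, and broken links."""
    report = {"missing": [], "dangling": [], "broken_links": []}
    pending_from = None
    for line in output.splitlines():
        m = re.match(r"broken link from\s+(\w+)\s+" + HEX + r"$", line)
        if m:
            # The target of the link is reported on the following line.
            pending_from = (m.group(1), m.group(2))
            continue
        m = re.match(r"\s+to\s+(\w+)\s+" + HEX + r"$", line)
        if m and pending_from:
            report["broken_links"].append((pending_from, (m.group(1), m.group(2))))
            pending_from = None
            continue
        m = re.match(r"(missing|dangling)\s+(\w+)\s+" + HEX + r"$", line)
        if m:
            report[m.group(1)].append((m.group(2), m.group(3)))
    return report


SAMPLE = """Checking objects: 100% (8970/8970), done.
broken link from    tree 03a88f876eb3f6157f76461a3ae6cb18bbb86561
              to    blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
dangling commit 76814e15074b540bc2f7e78daf3f5175a8759523
missing commit 6799ddac675cab54060cdfb066dbfadb6708fc3f
missing blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf
"""

for source, target in parse_fsck(SAMPLE)["broken_links"]:
    print(source, "->", target)
```

The broken_links entries are the useful ones: each names the tree to inspect next and the blob it can no longer find.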

Continuing with Linus’s advice, I now had enough information to use the git-ls-tree command to list the contents of the tree object called out above:

> git ls-tree 03a88f876eb3f6157f76461a3ae6cb18bbb86561
100644 blob 312d8994f1005a9563a9410c592b27000c201101	building-test.js
100644 blob f84006fd14c6d4b2ccc3ef22b2fe02abf535bd1a	folds-test.js
100644 blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf	index.js
100644 blob 8b1f47bce7ec989dff7e936279d63d1d02f6a92d	indexing-test.js
100644 blob 3f2b45f8cd9dfd486c8e821ee672ed66a34768df	inf-test.js
100644 blob 0e6d1985aa59d17e2115bc6c7936d2ac88b00457	list-test.js
100644 blob 40310b5df53691d0e1ba4118c0e3ab66ed766990	reducing-test.js
100644 blob 8244d5fb2768ad5c7c33890ee26c797c2df6262b	searching-test.js
100644 blob ff34a2f15faf6eec7d9c9635e79d2a0abdadfb42	sub-test.js
100644 blob ac0c029cc64870c7445c4bdd9d7fe20646b5cc33	trans-test.js
100644 blob 99781d303e90b7aa4de8d630c1053a42f87e8331	zip-test.js
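Each line of a listing like this holds a mode, an object type, and an object ID, followed by a tab and the file name. Picking the entry for one blob out of a longer listing could be sketched as below. The find_in_tree name is hypothetical, and the sketch assumes plain file names (git quotes names containing unusual characters, which this does not handle).

```python
def find_in_tree(ls_tree_output, object_id):
    """Return the names of tree entries whose object ID equals object_id."""
    names = []
    for line in ls_tree_output.splitlines():
        # Metadata and file name are separated by a single tab.
        meta, sep, name = line.partition("\t")
        if not sep:
            continue
        mode, obj_type, oid = meta.split()
        if oid == object_id:
            names.append(name)
    return names


SAMPLE = "\n".join([
    "100644 blob f84006fd14c6d4b2ccc3ef22b2fe02abf535bd1a\tfolds-test.js",
    "100644 blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf\tindex.js",
    "100644 blob 8b1f47bce7ec989dff7e936279d63d1d02f6a92d\tindexing-test.js",
])

print(find_in_tree(SAMPLE, "67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf"))
# ['index.js']
```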

Scanning the list, I found the culprit blob 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf and its associated file: index.js. So now I knew the source of the problem, but I didn't know which version of that file created the problem to begin with. Back to the command line:

> git log --raw --all
commit cf63a71497e027d96614cfff6ba1d297f1a1a26e
Author: Steven Syrek <steven.syrek@example.com>
Date:   Mon Jul 18 11:55:40 2016 -0400
    Add tests for set operations on lists
:100644 100644 67a45ac... c1c2f99... M  test/list/index.js
:000000 100644 0000000... 23c47fe... A  test/list/set-test.js
commit f3bc2c55b22deb889f99cdd45663c20a8e8e79c1
Author: Steven Syrek <steven.syrek@example.com>
Date:   Mon Jul 18 11:14:13 2016 -0400
    Add tests for list zipping and unzipping functions and remove exponentiation operator from tests and examples
:100644 100644 01af47b... 3b1bf35... M  source/list/zip.js
:100644 100644 21206e2... 67a45ac... M  test/list/index.js
:000000 100644 0000000... 99781d3... A  test/list/zip-test.js

The git-log command, with the --raw and --all options, will show the entire commit history of a repo. I only show the relevant parts from mine above. What we can see here is that object 21206e20386e0365bc6f15d0ccd372b1c72b5667 precedes the corrupted object 67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf, which is in turn followed in the subsequent commit (they are listed in reverse order) by object c1c2f99072ef41aca89e963cfb0143f897e0de78.
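The lines that begin with a colon are the interesting ones: each gives the old and new file modes, the old and new blob IDs (abbreviated), a status letter, and the path. A small Python sketch of the lookup I did by eye follows. The names RAW_LINE and blob_neighbors are made up for this example, and the pattern is based only on the lines shown above (it does not try to handle renames or copies, which list two paths).

```python
import re

# :<old mode> <new mode> <old blob>... <new blob>... <status>  <path>
RAW_LINE = re.compile(r"^:\d{6} \d{6} ([0-9a-f]+)\.* ([0-9a-f]+)\.* [A-Z]\d*\s+(.+)$")


def blob_neighbors(log_text, path, blob_id):
    """Return (before, after): the blob IDs that precede and follow blob_id for path."""
    before = after = None
    for line in log_text.splitlines():
        m = RAW_LINE.match(line.strip())
        if not m or m.group(3) != path:
            continue
        old, new = m.group(1), m.group(2)
        # The log abbreviates IDs, so compare by prefix against the full ID.
        if blob_id.startswith(new):
            before = old
        if blob_id.startswith(old):
            after = new
    return before, after


SAMPLE = "\n".join([
    ":100644 100644 67a45ac... c1c2f99... M  test/list/index.js",
    ":000000 100644 0000000... 23c47fe... A  test/list/set-test.js",
    ":100644 100644 01af47b... 3b1bf35... M  source/list/zip.js",
    ":100644 100644 21206e2... 67a45ac... M  test/list/index.js",
])

print(blob_neighbors(SAMPLE, "test/list/index.js", "67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf"))
# ('21206e2', 'c1c2f99')
```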

At this point, Linus says that I’m done, because I discovered which versions of the file preceded and followed the corrupted blob:

If you can do that, you can now recreate the missing object with git hash-object -w <recreated-file> and your repository is good again!

Unfortunately, after I tried this, my repository was not good again. Now things started to get hairy. Past Linus was out of advice, and present Linus (now also past Linus) probably had better things to do than help me. I was therefore left to follow the astute troubleshooting process that professional developers use every day:

Google. Google.

Stack Overflow. omg it’s down

This throw-everything-at-the-wall approach led me to try a few things, starting with the git-diff command. If I couldn’t automatically re-create the missing object, I reasoned desperately, perhaps I could do it manually:

> git diff 206e20386e0365bc6f15d0ccd372b1c72b5667..c2f99072ef41aca89e963cfb0143f897e0de78
fatal: ambiguous argument '206e20386e0365bc6f15d0ccd372b1c72b5667..c2f99072ef41aca89e963cfb0143f897e0de78': unknown revision or path not in the working tree.

Oops. I forgot the leading characters:

> git diff 21206e20386e0365bc6f15d0ccd372b1c72b5667..c1c2f99072ef41aca89e963cfb0143f897e0de78
diff --git a/21206e20386e0365bc6f15d0ccd372b1c72b5667..c1c2f99072ef41aca89e963cfb0143f897e0de78 b/c1c2f99072ef41aca89e963cfb0143f897e0de78
index 21206e2..c1c2f99 100644
--- a/21206e20386e0365bc6f15d0ccd372b1c72b5667..c1c2f99072ef41aca89e963cfb0143f897e0de78
+++ b/c1c2f99072ef41aca89e963cfb0143f897e0de78
@@ -25,3 +25,7 @@
 export * from './sub-test';
 export * from './searching-test';
 export * from './indexing-test';
+
+export * from './zip-test';
+
+export * from './set-test';

Above are the lines in index.js that changed between the two versions of the file on either side of the corrupted blob. The added lines are marked with a +, with a few surrounding lines also shown for context. Deleted lines, if there had been any, would have been marked with a -. Since two identical files, when hashed, should produce identical hash keys, I thought I'd try to brute force a solution by deleting the changed lines and re-creating the blob by hand:

> git hash-object -w ./test/list/index.js
2a60520000698ad964e4e61fab31f9b862763550

Nope. Try again, maybe just deleting the lines marked +.

> git hash-object -w ./test/list/index.js
21206e20386e0365bc6f15d0ccd372b1c72b5667

Nope, but interesting. I managed to recreate the original state of the object before the corrupted commit happened, but I guess what I was really trying to do was re-create the correct intermediate state? There weren’t too many possibilities, fortunately, since I had uncharacteristically been going through a good git hygiene phase. So I changed the file once more, adding to it only those lines marked with a + that I recalled adding before everything went bits up:

> git hash-object -w ./test/list/index.js
67a45ac2f58a444fa4db11cd9ab7e024a8e35dcf

Yay.
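Why does guessing work at all? A blob's ID is just the SHA-1 of a short header ("blob", the content length, and a zero byte) followed by the file's bytes, so the right bytes always give the right ID. The Python sketch below reproduces that calculation and automates my trial and error. The find_variant function is hypothetical: it assumes the candidate lines were appended to the end of the file, in order, and that no line-ending conversion or other filter changes the bytes on the way in.

```python
import hashlib
from itertools import combinations


def git_blob_hash(content: bytes) -> str:
    """Compute the SHA-1 object ID git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def find_variant(base_lines, added_lines, target_hash):
    """Try every in-order subset of added_lines appended to base_lines.

    Returns the subset whose resulting file hashes to target_hash, or None.
    """
    for k in range(len(added_lines) + 1):
        for picked in combinations(added_lines, k):
            candidate = "".join(base_lines + list(picked)).encode("utf-8")
            if git_blob_hash(candidate) == target_hash:
                return list(picked)
    return None


# The empty blob has a well-known ID, which makes a handy sanity check.
print(git_blob_hash(b""))
# e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

With four candidate lines there are only sixteen subsets to try, which is why doing it by hand was feasible. The number doubles with every extra line, so this only helps when the commits are small.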

> git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (8970/8970), done.
dangling commit 76814e15074b540bc2f7e78daf3f5175a8759523
missing commit 6799ddac675cab54060cdfb066dbfadb6708fc3f
dangling blob 2a60520000698ad964e4e61fab31f9b862763550
dangling commit 41634cd81964068acb153bfa355d63bd80fc7cef
dangling commit 5bf415e2bdbc47822ae99b64c2a0f6b4f288eefb

Oh right, I have a healthy blob now, but I’m still missing the commit object that points to it. Now what? It’s git-gc to the rescue!

> git gc
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
warning: reflog of 'HEAD' references pruned commits
warning: reflog of 'refs/heads/restructure' references pruned commits
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: Failed to traverse parents of commit b267a6a8264c0cdc72d047049610fc91e9f7c06f
error: failed to run repack

Or not. That was supposed to garbage collect all the… garbage. And fix… all the things. I don’t know why I thought that. But I hoped. I really, truly hoped. And then I imprecated. Noting the “reflog” message above, I was on to my next brilliant idea:

> git reflog expire --all --stale-fix
error: Could not read 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: Failed to traverse parents of commit b267a6a8264c0cdc72d047049610fc91e9f7c06f

As everyone knows, when using command line tools, the more options you add, the more masculine you are. It doesn’t matter if you don’t know what they do. Real men don’t read man pages: they just move fast and break things. Plus, I rather liked the idea of re-flogging my repo. But no. That didn’t work, either.

I had by now waded well into the waters of trying absolutely anything, without regard for sense or soundness. I turned back to the logs, which always feels one step shy of admitting defeat and finding a corner in which to quietly weep. But perhaps a solution would miraculously present itself, something I missed before but was there the whole time for all to see?

> git log 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: bad object 6799ddac675cab54060cdfb066dbfadb6708fc3f

Nope.

> git ls-tree 6799ddac675cab54060cdfb066dbfadb6708fc3f
fatal: not a tree object

Nope. I mean, duh. Somehow, I then had the bright idea of examining the logs for just the restructure branch of my repo, which is the one I had been working on when the fatal blinking cursor entered my life:

> tail -n 40 .git/logs/refs/heads/restructure
...
44dc22e706fb029a9c96f3bd125755fd55ac882b 6799ddac675cab54060cdfb066dbfadb6708fc3f Steven Syrek <steven.syrek@example.com> 1468788351 -0400	commit: Replace isEq function in all tests with should.eql
6799ddac675cab54060cdfb066dbfadb6708fc3f b267a6a8264c0cdc72d047049610fc91e9f7c06f Steven Syrek <steven.syrek@example.com> 1468789759 -0400	commit: Separate out functions in Ord tests
...
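Each line of that log file starts with two object IDs: where the branch pointed before an operation and where it pointed afterwards. Finding the last value the branch had before it landed on a known-bad commit could be sketched like this in Python. The previous_tip name is hypothetical, and the sketch assumes the entry in question is an ordinary "commit:" entry; after a reset or rebase, the previous tip is not necessarily the bad commit's parent.

```python
def previous_tip(reflog_text, bad_commit):
    """Return the ID the branch pointed to just before it moved to bad_commit."""
    for line in reflog_text.splitlines():
        # Format: <old id> <new id> <name> <email> <timestamp> <tz>\t<message>
        fields = line.split(" ", 2)
        if len(fields) == 3 and fields[1] == bad_commit:
            return fields[0]
    return None


SAMPLE = "\n".join([
    "44dc22e706fb029a9c96f3bd125755fd55ac882b 6799ddac675cab54060cdfb066dbfadb6708fc3f "
    "Steven Syrek <steven.syrek@example.com> 1468788351 -0400\t"
    "commit: Replace isEq function in all tests with should.eql",
    "6799ddac675cab54060cdfb066dbfadb6708fc3f b267a6a8264c0cdc72d047049610fc91e9f7c06f "
    "Steven Syrek <steven.syrek@example.com> 1468789759 -0400\t"
    "commit: Separate out functions in Ord tests",
])

print(previous_tip(SAMPLE, "6799ddac675cab54060cdfb066dbfadb6708fc3f"))
# 44dc22e706fb029a9c96f3bd125755fd55ac882b
```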

Huh. “Maybe the same diff thing I did for the blob objects will work on the commit objects,” I thought. So:

> git diff b267a6a8264c0cdc72d047049610fc91e9f7c06f..44dc22e706fb029a9c96f3bd125755fd55ac882b
...(bunch of irrelevant stuff)

OK. No. But at least I still had a commit hash, 44dc22e706fb029a9c96f3bd125755fd55ac882b, to do something with. It was the last good one before my arch nemesis, 6799ddac675cab54060cdfb066dbfadb6708fc3f, darkened my world. I consulted the docs. I consulted the Internet. And I took one more shot in the dark:

> git branch -l rewrite-tests 44dc22e706fb029a9c96f3bd125755fd55ac882b

What I did here was to create a new branch called rewrite-tests, using the 44dc22e706fb029a9c96f3bd125755fd55ac882b commit (i.e. the last good one) as its start point, in accordance with the git branch [--set-upstream | --track | --no-track] [-l] [-f] <branchname> [<start-point>] pattern specified in the git-branch docs. I am not actually sure what the -l option is for, or even whether it's necessary. Someone said to use it. Shrug. (For the record: in the git of that era, -l asked for a reflog to be created for the new branch. Since Git 2.20, -l is shorthand for --list, so on a current git you should leave it off.)

I then moved all the files out of the repo and did one of these:

git checkout rewrite-tests

The git-checkout command sets HEAD to the specified branch. In other words, I told git that I wanted to work on the rewrite-tests branch. Then, I just copied all the files back over, re-committed them, and left the restructure branch to wither and die.

And just like that, to my astonishment, I was done. The worst was over and none too soon: I was starting to see in hash keys. I had a new branch to develop, and none of my work was lost (despite the untimely deaths of a few intervening commits). Eventually, I squashed everything back into master, though I now tend to avoid working on that branch directly, in any repo, in the event one of these kerfuffles arises again.

I still visit Ms. Blinky from time to time, just to gloat. Actually, no, I don’t do that. But you can visit my wounded-and-repaired repo yourself, if you like: it contains my maryamyriameliamurphies.js project. It’s a substantial amount of code that I worked on entirely alone. You can imagine how I felt when I thought I might have ruined it. And how I felt when I figured out how to fix it.

At the beginning of this article, I suggested that a damaged git repository — since it is composed of discrete objects — could potentially be recovered through careful surgery. We have seen two possibilities for such an operation. The first, in the case of damaged blob objects, is to excise the offending blob(s) and then suture over the wound through a rehashing of the original file. The second, if the first fails (or if the problem is a damaged commit object, not just a blob), is to amputate the wounded branch at the point of its corruption, graft a new branch onto the stump, and recommit any files that were casualties of the procedure.

These are different solutions but similar in that they both entail repairing a data structure at a rather low level, even if it’s only manipulating files. In fact, the repair operations are possible precisely because a git repo is stored as a series of files. Insofar as a file system is really just a large data structure with an interface, the command line, a git repo is also a file system-like data structure with an interface, the git command and its various sub-commands and options. If you can learn how to use a file system from the command line, in other words, you can learn how to use git, too.

I wish this story had resolved into a set of specific instructions for solving a problem that you yourself might one day encounter. Unfortunately, I only have some trite encouragement to offer: you can do it! Because it’s so hard to know the cause of these sorts of errors, not to mention the best way to fix them, it’s also hard to generalize about them. All you can do is dig in and fight back against entropy. Learn your tools, don’t fear them. Try things yourself, if only for the experience, and before you beg one of your scientician friends to help you. Just remember to back up first!

If there is an obvious moral here, it’s the one that makes recovery much, much easier and far less costly should you ever be confronted by that dreaded, blinking status bar or its obtuse command line equivalent: commit early, and commit often. Preventive care, after all, is often the best medicine.


Published by HackerNoon on 2017/09/24