git bisect for the win

I recently discovered the git bisect command. At first I thought “meh, this is just another obscure git command I’ll never use”, but it turns out I used it several times in the last week and it saved me a non-negligible amount of time. I’m therefore writing this post to give a quick introduction to git bisect.

What problem does “git bisect” solve?

The goal of git bisect is to find the commit that introduced a particular change. This change can be:

  • your tests not passing anymore
  • a feature not working properly (but this one should imply that your tests are not passing, right? 😉)
  • a performance issue
  • … or pretty much anything else.

(For clarity, I’ll assume here that the change you’re looking for is a bug.)

If you wanted to do this manually, you’d have to

  1. Find a commit that doesn’t have the bug (e.g. a commit where your tests were passing)
  2. Find a commit that has this bug (e.g. a commit where your tests are *not* passing)
  3. Check out all the commits in between to see which one introduced the bug

The last step is obviously the most time-consuming one. If you make commits that are small and meaningful enough (and you should), you can easily end up making something like 10 commits a day. What if your tests were passing 10 days ago but aren’t anymore? Would you really want to go through the hassle of checking out 100 commits and testing them one by one? Probably not.

At this point, if you’ve ever followed a basic algorithms course, you’ve probably figured out that a linear search on those commits is far from the best approach. It’s easy to use a binary search instead.

  • Inputs:
    • a “good” commit commit_good, that doesn’t contain the bug you’re looking for
    • a “bad” commit commit_bad, that contains this bug
  • Algorithm: run a binary search to find the commit C that satisfies the following properties
    • C contains the bug
    • The commit just before C doesn’t contain the bug
  • Output: C, which is the faulty commit introducing the bug.
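To make the algorithm above concrete, here is a minimal sketch in bash. Everything in it is made up for illustration: the commits array is a fake history (oldest first), and is_bad is a stand-in for “check out this commit and run the tests”, hard-coded so that the bug appears at c7.

```shell
#!/bin/bash
# Hypothetical history, oldest first. The bug is assumed to appear at c7
# and to stay present in every later commit.
commits=(c1 c2 c3 c4 c5 c6 c7 c8 c9 c10)

# Stand-in for "check out the commit and run the tests".
# Returns 0 (true) when the commit contains the bug.
is_bad() {
  local n=${1#c}   # strip the leading "c" to get the commit number
  (( n >= 7 ))
}

# Binary search. Invariant: commits[lo] is good and commits[hi] is bad.
# When the two are adjacent, commits[hi] is the commit C we are after.
first_bad() {
  local lo=$1 hi=$2 mid
  while (( hi - lo > 1 )); do
    mid=$(( (lo + hi) / 2 ))
    if is_bad "${commits[mid]}"; then
      hi=$mid
    else
      lo=$mid
    fi
  done
  echo "${commits[hi]}"
}

first_bad 0 9   # prints c7
```

Keeping “lo is good, hi is bad” true at every step is what protects you from off-by-one errors: the loop only stops once the two indexes are neighbours.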

Now, you should be able to find the faulty commit in roughly log2(n) steps, where n is the number of commits between commit_good and commit_bad. In the example I introduced earlier, log2(100) ≈ 6.6, so that would be about 7 tests instead of 50 on average with a linear search (you only have to go through half the commits on average, assuming that each commit between our two boundary commits is equally likely to be the one that introduced the bug).
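If you want to check the arithmetic yourself, here is a small sketch. bisect_steps is a name I made up; it counts how many halvings are needed to narrow n candidates down to one, which is log2(n) rounded up. Since log2(100) is about 6.6, 100 commits need about 7 tests in the worst case.

```shell
#!/bin/bash
# Number of halvings needed to narrow n candidates down to a single one,
# i.e. ceil(log2(n)), computed with integer arithmetic only.
bisect_steps() {
  local n=$1 steps=0 span=1
  while (( span < n )); do
    span=$(( span * 2 ))
    steps=$(( steps + 1 ))
  done
  echo "$steps"
}

bisect_steps 100   # prints 7
bisect_steps 10    # prints 4
```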

If you think about it, it is still quite a pain to do it manually. First, you need to check out the commit in the middle between commit_good and commit_bad, let’s call it commit_pivot. Then, if this commit has the bug, it means that the bug has been introduced by commit_pivot or an older commit, and that you need to check out the commit in the middle between commit_good and commit_pivot. If it doesn’t have the bug, it means that the bug has been introduced by a newer commit and that you need to check out the commit in the middle between commit_pivot and commit_bad.

And then you repeat the process, all without making off-by-one errors, which are very likely to happen since you’re computing all that in your head and, well, you’re not a machine, even if that’s what your non-tech friends think.

A new hope: git bisect

Git bisect basically automates this process for you. First, you need to inform git that you want to start the process.

$ git bisect start

At any point, you can stop the process by running

$ git bisect reset

Then, just like our previous algorithm, git expects that you give it a “bad” and a “good” commit. Let’s say that my tests were passing 10 commits ago, and that they’re not passing anymore.

$ git bisect good HEAD~10
$ git bisect bad HEAD

(You can also pass a commit hash to these commands, such as a4211cb.)

Once you have given git a good and a bad commit, it will automatically start the process.

Bisecting: 5 revisions left to test after this (roughly 3 steps)
[778a1ad821098553f6ab35fea0a4f3ea44646ece] Fix a null pointer exception happening if the user doesn't enter his name

git checked out the commit 778a1ad, meaning that your working directory contains the exact same files as when this commit was made. You now need to manually test if your program was working at this point.

$ ./tests --all
...
All tests passed
BUILD SUCCESSFUL

In this case, the commit is good, meaning that the bug was introduced in a newer commit. Let’s tell this to git.

$ git bisect good
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[b7ac812b822689fe4394e673d68fb0e684843e24] Add an awesome feature

Again, run your tests or checks against the commit that git checked out for you.

$ ./tests --all
...
1 test failed
BUILD FAILED

This commit appears to be bad. Once more, let’s tell this to git.

$ git bisect bad

Do this as long as necessary. If at some point you are unable to test a commit for some reason, you can skip it by running

$ git bisect skip

When git is able to figure out the faulty commit, it’ll let you know:

b7ac812b822689fe4394e673d68fb0e684843e24 is the first bad commit
commit b7ac812b822689fe4394e673d68fb0e684843e24
Author: Christophe Tafani-Dereeper <[email protected]>
Date: Sun Nov 20 13:49:07 2016 +0200

  Remove an unused function

Unused function, huh?

Don’t forget to exit the bisecting process by typing git bisect reset.

Automating the process

I know, this is still a painful process. Fortunately, git lets you automate it! Once you have given it your bad and your good commit, you can use git bisect run and pass your test command as its argument.

$ git bisect start
$ git bisect good HEAD~10
$ git bisect bad HEAD
$ git bisect run ./tests --all

Git will basically go through the same process as we did in the previous section, and will mark commits as “good” or “bad” based on the exit status of your test command. If the exit status is 0 for the current commit, it will be marked as good. An exit status between 1 and 127 marks it as bad, except for 125, which tells git to skip the commit because it can’t be tested. Any other exit status aborts the bisect.

This also allows us to write custom scripts if we want to detect particular regressions. Let’s say for instance that your API was taking roughly 1 second to start yesterday, but that today it’s taking more than 10 seconds for some obscure reason. We could write a small script that attempts to make an HTTP request to it with a 2-second timeout, and exits with a non-zero status code if it fails to do so.

test-timeout.sh

#!/bin/bash
# Exits with a non-zero status (curl's exit code 28) if no response arrives within 2 seconds.
curl --max-time 2 localhost:1234
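The same trick works for things that aren’t HTTP endpoints. As a rough sketch, and swapping curl for the timeout command from GNU coreutils (so this is a different technique from the script above, and assumes timeout is installed), you can put a time budget on any command. within_seconds is a name I made up for the example.

```shell
#!/bin/bash
# within_seconds LIMIT COMMAND...: run COMMAND, failing if it takes longer
# than LIMIT seconds. `timeout` exits with 124 when the limit is hit, which
# `git bisect run` would treat as "bad" since it lies between 1 and 127.
within_seconds() {
  local limit=$1
  shift
  timeout "$limit" "$@"
}

# `sleep` stands in for the real command you want to time.
rc=0; within_seconds 2 sleep 0.1 || rc=$?; echo "fast command: $rc"   # prints "fast command: 0"
rc=0; within_seconds 1 sleep 3   || rc=$?; echo "slow command: $rc"   # prints "slow command: 124"
```

Note that this measures how long the command itself runs, whereas the curl script measures how long a running server takes to answer.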

Then we can pass test-timeout.sh to git bisect run, which should produce output looking like this:

$ git bisect run ./test-timeout.sh

running ./test-timeout.sh
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[c1aa78ac6f40c8a8b6ab6547582a67f326a7ef27] Fix a null pointer exception

running ./test-timeout.sh
Bisecting: 1 revision left to test after this (roughly 1 step)
[6ef81697ce2544c9535b6911ab6aa46cbe1929c5] Add  logout button

running ./test-timeout.sh
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[d9345a063eaae9f1ffaf352ba04a8d4c50a96ca2] Remove link from sidebar

6ef81697ce2544c9535b6911ab6aa46cbe1929c5 is the first bad commit
commit 6ef81697ce2544c9535b6911ab6aa46cbe1929c5
Author: Christophe Tafani-Dereeper <[email protected]>
Date: Sun Oct 30 20:10:50 2016 +0200

 Introduce a fancy library

Et voilà! Note that we could have obtained the same result by running git bisect run curl --max-time 2 localhost:1234 directly.
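One last thing worth knowing for custom scripts: exit status 125 is the scripted equivalent of git bisect skip. Here is a sketch of how a script could use it when a commit doesn’t even build. run_step is a hypothetical helper, and true and false stand in for real build and test commands.

```shell
#!/bin/bash
# run_step BUILD_CMD TEST_CMD (hypothetical helper)
# Maps the outcome of one bisect step onto the exit codes that
# `git bisect run` understands:
#   0   -> commit is good
#   1   -> commit is bad (any code in 1..127 except 125 means bad)
#   125 -> commit cannot be tested, so git skips it
run_step() {
  local build_cmd=$1 test_cmd=$2
  # Expanded unquoted on purpose, so "./tests --all" splits into words.
  if ! $build_cmd; then
    return 125
  fi
  if $test_cmd; then
    return 0
  fi
  return 1
}

# `true` and `false` stand in for real build and test commands.
for args in "true true" "true false" "false true"; do
  rc=0
  run_step $args || rc=$?
  echo "run_step $args -> $rc"
done
# prints:
#   run_step true true -> 0
#   run_step true false -> 1
#   run_step false true -> 125
```

In a real script you would end with something like run_step "make" "./tests --all" followed by exit $?, where make is only my guess at what your build command looks like.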
