Resting BUILD face

I am super excited that pmbethe09 and lautentlb just put in a bunch of extra work to open source Buildifier. Buildifier is a great tool we use in Google to format BUILD files. It automatically organizes attributes, corrects indentation, and generally makes them more readable and excellent.

To try it out, clone the repo and build it with Bazel:

$ git clone
$ cd buildifier
$ bazel build //buildifier
Extracting Bazel installation...

INFO: Found 1 target...
Target //buildifier:buildifier up-to-date:

INFO: Elapsed time: 203.309s, Critical Path: 7.54s
INFO: Build completed successfully, 8 total actions

Now try it out on an ugly BUILD file:

$ echo 'cc_library(srcs = ["", ""], name = "foo")' > BUILD
$ ~/gitroot/buildifier/bazel-bin/buildifier/buildifier BUILD
$ cat BUILD
    name = "foo",
    srcs = [

Finally, why run commands manually when you can have your editor do it for you? I use emacs, so I can set up a hook like this:

(add-hook 'after-save-hook
            (if (string-match "BUILD" (file-name-base (buffer-file-name)))
                  (shell-command (concat "/path/to/buildifier/bazel-bin/buildifier/buildifier " (buffer-file-name)))
                  (find-alternate-file (buffer-file-name))))))

You could also set up a git hook to run this before committing, if that’s more your style. Regardless, give it a try! It’s a quick, easy way to make your BUILD files more readable.

Labeling Git Branches

I came across git branch descriptions today and it is so freakin useful that I wanted to share.

My branches usually look like this:

$ git branch
* implement-foo

It is… not the clearest. Luckily, you can add descriptions to branches:

$ git checkout implement-foo2
$ git branch --edit-description

---Pops up an editor---

These are the tests for my implement-foo change, they can be cherry-picked onto implement-foo when they are done.

---Save & exit---

The only problem is that these descriptions don’t show up when you do git branch. To display them, use jsageryd‘s script (and vote up the comment, it should really be nearer the top):


branches=$(git for-each-ref --format='%(refname)' refs/heads/ | sed 's|refs/heads/||')
for branch in $branches; do
  desc=$(git config branch.$branch.description)
    if [ $branch == $(git rev-parse --abbrev-ref HEAD) ]; then
       branch="* 33[0;32m$branch33[0m"
       branch="  $branch"
  echo -e "$branch 33[0;36m$desc33[0m"

Save it as something (I called it “branch”), make it executable, add it to your path, and then you can do:

$ branch
* implement-foo2 These are the tests for my implement-foo change, they can be cherry-picked onto implement-foo when they are done.

It’s even got nice colors for the selected branch and descriptions.

––thursday #7: git-new-workdir

Often I’ll fix a bug (call it “bug A”), kick off some tests, and then get stuck. I’d like to start working on bug B, but I can’t because the tests are running and I don’t want to change the repo while they’re going. Luckily, there’s a git tool for that: git-new-workdir. It basically creates a copy of your repo somewhere else on the filesystem, with all of your local branches and commits.

git-new-workdir doesn’t actually come with git-core, but you should have a copy of the git source anyway, right?

$ git clone

Copy the git-new-workdir script from contrib/workdir to somewhere on your $PATH. (There are some other gems in the contrib directory, so poke around.)

Now go back to your repository and do:

$ git-new-workdir ./ ../bug-b

This creates a directory one level up called bug-b, with a copy of your repo.

Thanks to Andrew for telling me about this.

Edited to add: Justin rightly asks, what’s the difference between this and local clone? The difference is that git-new-workdir creates softlinks everything in your .git directory, so your commits in bug-b appear in your original repository.

How to Make Your First MongoDB Commit

10gen is hiring a lot of people straight out of college, so I thought this guide would be useful.

Basically, the idea is: you have found and fixed a bug (so you’ve cloned the mongo repository, created a branch named SERVER-1234, and committed your change on it). You’ve had your fix code-reviewed (this page is only accessible to 10gen wiki accounts). Now you’re ready to submit your change, to be used and enjoyed by millions (no pressure). But how do you get it into the main repo?

Basically, this is the idea: there’s the main MongoDB repo on Github, which you don’t have access to (yet):

However, you can make your own copy of the repo, which you do have access to:

So, you can put your change in your repo and then ask one of the developers to merge it in, using a pull request.

That’s the 1000-foot overview. Here’s how you do it, step-by-step:

  1. Create a Github account.
  2. Go to the MongoDB repository and hit the “Fork” button.

  3. Now, if you go to, you’ll see that you have a copy of the repository (replace yourUsername with the username you chose in step 1). Now you have this setup:

  4. Add this repository as a remote locally:
    $ git remote add me

    Now you have this:

  5. Now push your change from your local repo to your Github repo, do:
    $ git push me SERVER-1234
  6. Now you have to make a pull request. Visit your fork on Github and click the “Pull Request” button.

  7. This will pull up Github’s pull request interface. Make sure you have the right branch and the right commits.

  8. Hit “Send pull request” and you’re done!

Git for Interns and New Employees

Think of commits as a trail future developers can follow. Would you like to leave a beautiful, easy-to-follow trail, or make them follow your… droppings?

My interns are leaving today 😦 and I think the most important skill they learned this summer was how to use git. However, I had a hard time finding concise references to send them about my expectations, so I’m writing this up for next year.

How to Create a Good Commit

A commit is an atomic unit of development.

You get a feeling for this as you gain experience, but commit early and often. I’ve never, ever thought that someone broke up a problem into too many commits. Ever.

That said, do not commit:

  1. Whitespace changes as part of non-whitespace commits.
  2. Debugging messages (“print(‘here!’)” and such).
  3. Commented out sections of code.

  4. Any type of binary (in general… if you think you have a special case, ask someone before you commit)
  5. Customer data, images you don’t own, passwords, etc. Assume that anything you commit will be included in the repo forever.
  6. Auto-generated files (any intermediate building files) and files specific to your system. If it mentions your personal directory structure, it probably shouldn’t be committed.

Point #5 deserves a little extra mention: git keeps everything, so when in doubt, don’t commit something dubious. You can always add it later. When I was new at 10gen, I found a memory leak in MongoDB and was told to commit “what was needed to reproduce it.”

I committed a 20GB database to the MongoDB repo.

One emergency surgery later and the repo was back to its svelte self. So it is possible to remove commits if you have to, but try not to commit stuff you shouldn’t. It’s extremely annoying to fix. And embarrassing.

When you’re getting ready to commit, run git gui. This is the #1 best tool I’ve found for beginners learning how to make good commits. You’ll see something that looks sort of like this:

The upper-left pane is unstaged changes and the lower right is staged changes. The big pane shows what you’ve added to and removed from the file currently selected.

Right click on a hunk to stage it, or a single line from the hunk.

Click on this icon: to stage all of the changes in a file.

Note that notes.js is moved to the staging area (if only some parts of notes.js are staged, it will show up in both the staging and unstaged area).

Before you commit, look at each file in the staging area by clicking on its filename. Any stray hunks make it in? Whitespace changes? Remove those lines by right-clicking and unstaging.

That extra line isn’t part of this change so it shouldn’t be part of the commit.

git gui will also show you when you have trailing whitespace:

And if you have two lines that look identical, it’s probably a whitespace issues (maybe tabs vs. spaces?).

Once you’ve fixed all that, you’re ready to describe your change…

Writing a Good Commit Message

First of all, there are a couple of semantic rules for writing good commit messages:

  • One sentence
  • In the present tense
  • Capitalized
  • No period
  • Less than 80 characters

That describes the form, but just like you can have a valid program that doesn’t do anything, you can have a valid commit message that’s useless.

So what does a good commit message look like? It should clearly summarize what the change did. Not “changed X to Y” (although that’s better than just saying “Y”, which I’ve also seen) but why X had to change to Y.

Examples of good commit messages:

Show error message on failed "edit var" in shell
Very nice “added feature”-type message.
Extra restrictions on DB name characters for Windows only
Would have been nice to have a description below the commit line describing why we needed to change this for Windows, but good “changed code”-type message.
Compile when SIGPIPE is not defined
Nice “fixed bug”-type message.
I think this is the only case where you can get away with a 1-word commit message

Examples of bad commit messages:

Add stuff
Doesn’t say what was added or why
Fix test, add input form, move text box
A commit should be one thought, this is three. Thus, this should probably be three commits, unless they’re all part of one thought you haven’t told us about.

And once you’ve committed…

When you inevitably mess up a commit and realize that you’ve accidentally committed a mishmash of ideas that break laws in six countries and are riddled with whitespace changes, check out my post on fixing git mistakes.

Or just go ahead and push.

––thursday #3: a handy git prompt

Everyone says to use git branches early and often, but I inevitably lose track of what branch I’m on. My workflow generally goes something like:

  1. Check out a branch
  2. Lunch!
  3. Get back to my desk and make an emergency bug fix
  4. Commit emergency fix
  5. Suddenly realize I’m not on the branch I meant to be on, optimally (but not always) before I try to push

To keep track, I’ve modified my command prompt to display what I’m working on using __git_ps1, which prints a nice “you are here” string for your prompt:

$ __git_ps1
$ git checkout myTag123
$ __git_ps1

(Newlines added for clarity, it actually displays as ” (master)$ git checkout…”, which is generally what you want so your prompt doesn’t contain a newline.) I’m not sure what’s up with all the ‘()’s, but it does let you distinguish between branches (‘(branchname)’) and tags (‘((tagname))’).

__git_ps1 will even show you if you’re in the middle of a bisect or a merge, which is another common workflow for me:

  1. Merge something and there are a ton of conflicts
  2. Boldly decide to deal with it tomorrow and go home for the day
  3. The next day, write and emergency patch for something
  4. Check git status
  5. Suddenly I remember that I’m in the middle of a merge
  6. Do unpleasant surgery to get my git repo back to a sane state

If you tend to fall into a similar workflow, I highly recommend modifying your prompt to something like:

$ PS1='w$(__git_ps1)> '
~/gitroot/mongo (v2.0)> 

And suddenly life is better, which is what Linux is all about.

You can run PS1=... on a per-shell basis (or export PS1 for all shells this session) or add that line to your ~/.bashrc file to get that prompt in all future shells.

Finally, __git_ps1 isn’t a binary in your path, it’s a function (so it won’t show up if you run which __git_ps1). You can see its source by running type __git_ps1.

Note: unfortunately this doesn’t work for me on OS X, so I think it might be Linux-only.


I’ve learned a lot about git, usually in a hurry after I mess up and have to fix it. Here are some basic techniques I’ve learned that may help a git beginner.

Fixing your ungood code

Let’s say you’re a clerk working for an Orwellian Ministry of Truth and you find out the new chocolate ration is 20 grams. You look at tonight’s newscast transcript, newscast.txt, which says:

...and in other news, the Ministry of Plenty has announced that chocolate 
rations will be held steady at 40 grams for the forthcoming quarter...

Well, that won’t do. We modify the transcript newscast to say:

...and in other news, the Ministry of Plenty has announced that chocolate 
rations will be cut to 20 grams for the forthcoming quarter...

And we commit:

$ git add newscast.txt
$ git commit -m "Fixed chocolate ration"
[master 9772a49] Fixed chocolate ration
 1 files changed, 1 insertions(+), 1 deletions(-)

As you’re about to push, your supervisor, who has been hovering, shrieks: “Do you want to be an unperson? Don’t say they’ve lowered a ration!”

So, we’ll modify the file again:

...and in other news, the Ministry of Plenty has announced that chocolate 
rations will be raised to 20 grams for the forthcoming quarter...

Now we’ll add this change, as though we were going to make a new commit:

$ git add newscast.txt
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#	modified:   newscast.txt

Running git commit --amend sweeps everything in our staging area into the last commit:

$ # this opens your text editor, in case you want to change the message
$ git commit --amend
[master 04ce65d] Fixed chocolate ration
 1 files changed, 1 insertions(+), 1 deletions(-)

Now you just have one commit with the “correct” changes.

We have always been at war with Eastasia

Sometimes you have a mistake so severe that you don’t even want it to exist in the repository’s history anymore. If you haven’t pushed yet, you can use git reset to unwrite any history that happened after a given commit.

Let’s say we messed something up and we want to get back to 72ecbbda47c0712846312849bab1fb458cd9fffb:

$ git reset --hard 72ecbbda47c0712846312849bab1fb458cd9fffb

And you’re back to 72ecbbda47c0712846312849bab1fb458cd9fffb, like whatever happened never actually happened.

git reset isn’t just good for revisionist history, it’s also nice when you have a merge conflict, some deleted files, a strange thing with branches, and you don’t even know what’s going on: you just want to go home. A hard reset can get you out of almost any problem.

However, once you’ve pushed, resetting isn’t an option. If you reset a public repo on a “real” project, you’ll be unpersoned so fast your head will spin.

When you’ve already pushed a crimethink

Let’s say you made a mistake (as above), but have already pushed to a remote repository. Now you have to either fix the commit or remove it.

In the example above, it’s easy enough to fix: change “cut the chocolate ration” to “raised the chocolate ration” and you’re done, so you might as well just push a new commit with the fix.

However, sometimes a fix will take longer (e.g., you need research a plausible story to explain away inconsistencies). You don’t want anyone getting confused in the meantime, so if you cannot immediately correct history, you should politely back it out. This is where git revert comes in.

First, look up the hash of the commit you want to undo.

$ git log -1 # -1 means only show the latest 1 commit
commit 72ecbbda47c0712846312849bab1fb458cd9fffb
Author: Kristina 
Date:   Thu Feb 23 09:07:58 2012 -0500

    Airplane was now invented by the Party

Now, revert that hash:

$ git revert 72ecbbda47c0712846312849bab1fb458cd9fffb

git revert will pop up your editor and you can mess with the revert message, if you want.

Changing airplane's invention date is going to take more work than anticipated

This reverts commit 72ecbbda47c0712846312849bab1fb458cd9fffb.

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Changes to be committed:
#   (use "git reset HEAD ..." to unstage)
#       modified:   newscast.txt

Save, exit, and push to the remote repo.

$ git revert 72ecbbda47c0712846312849bab1fb458cd9fffb
[master 84e6f13] Changing airplane's invention date is going to take more work than anticipated
 1 files changed, 0 insertions(+), 1 deletions(-)

You can revert any commit from any time in history. Note that this creates a “uncommit” in addition to the original commit: it’s essentially creating an “opposite diff” to cancel out the original commit. So the original commit is still there.

Ignorance is strength, but just in case…

Some resources you may find helpful, when learning git:

  • git gui: invaluable for staging commits
  • gitk: easier to read than git log when examining commit history
  • GitReady: most useful site I’ve found for recipes of common tasks.

And remember, use a full sentence describing your change in commit messages. Big Brother is watching.

Another comic

I finally put up another comic. I’ve been delayed for two reasons: first, I’ve been really busy with work. I’ve done some major work on the PHP driver, so it should now be easily installable on Linux and Mac (it’s also faster).

Second, I’ve been wanted to add a “one shot gag” section, which requires some fairly major rewriting of my site code, so I tried to get Git set up on Hostmonster (my hosting provider). I got it set up fine, but the money-grubbing jerks at Hostmonster block outgoing ssh unless you pay for a dedicated IP. So, I can’t put the source of my site on Github, which is sad. However, at least I have Git set up locally, so I don’t have to worry about breaking stuff.

Anyway, hope people enjoy the one-shots. More coming.

Pain in my CVS

This is pretty geeky, so sorry non-technical reader. There’s a glossary at the bottom if you’d like to follow along.

I’ve been developing a PHP database driver for work, and this week I proposed it as a new PECL (pronounced “pickle”) package. Unfortunately, they use CVS for their packages. I’m used to Git.

So, I created a cvsroot/ directory and imported my driver to it. After a couple tries, I figured that out. Then I was confused, all my files were suddenly named file,txt instead of file.txt. Then I realized that this was my master repository, and I had to check out code from there. So, I made a cvsstuff/ directory and did

$ cd cvsstuff
$ cvs co mongo

And it did the right thing! Cool.

So now I had to do it with’s remote repository. I tried checking out a couple packages to get a feel for the syntax. Then I figured out my import command, triple checked it, and was about to press enter when… hmm, where was it getting the path to the directory I was importing? I was in my home directory, so… oh, crap. So, I almost uploaded my home directory to a public repository on (for Windows users, this is roughly equivalent to uploading My Documents). My hand was hovering over the enter key, but I didn’t!

I successfully uploaded my driver, and all was well.

Non-technical person explanation:

you’ve heard of it, maybe?   It’s a programming language, like C++ or Java, except it’s usually used for making webpages.
information storage, usually can be pictured as a bunch of tables. MySQL is the most famous, my company’s is called Mongo (
Database driver
when you create a database, like our company did, you want everyone, regardless of what language they program in, to be able to use it. So you write drivers, which translate a programming language to database-speak.
PECL package
PHP has a system set up so, if someone write a useful program in PHP, it’s easy for you, halfway around the world, to download it and use it. It’s called PEAR, and it lets you type:   

$ pear install cool-package

And then you have cool-package installed on your system, too. All the packages are open source.

When you’re working on a programming project, often you’re like, “I’ll just fix this little thing here.” And then when it’s fixed, suddenly no one can log in and you can’t remember exactly what you changed and the your company crashes and burns. And that is where version control software comes in. Whenever someone makes a change, they use a program like CVS to say “I’ve changed lines 4, 6, and 23 of” Then they send this change to the central computer, who has the master code and updates it with the person’s change. If the change turns out to break stuff, there’s a record of exactly what changed and you can revert the code back to its original state. CVS is the grand-daddy of all version control, Git is another, more recent version control system.