Build, y u go slow?

yuno

When a build is taking too long, it can be very helpful to know what it’s doing. Bazel has built-in tooling that lets you visualize what each thread is doing at any given moment of a build and which build steps are slowing down your overall build.

To try out Bazel’s profiling tools, build your favorite (or, rather, least-favorite) target with the --profile option:

$ bazel build --profile=myprofile.out //snail:slow-lib

This will write the profile to a file called myprofile.out in the current directory. Once your build finishes, you can take a look at this file, but it’s not really designed to be read by humans. Instead, plug it into Bazel’s analyze-profile command:

$ bazel analyze-profile --html myprofile.out

Now you wait for Bazel to analyze the profile info and enjoy this picture of a snail I saw walking Domino the other day (penny included for scale):

Make your project build faster than this.
Make your project build faster than this.

(After taking this photo, I moved the snail onto the grass, since I’m pretty sure it was not delighted to be in the middle of a NYC sidewalk.)

Ding, analysis is probably done. Now you can open up myprofile.out.html and see your build, broken down into hundreds or thousands of individual steps. A screenshot of the output:

Screen Shot 2015-09-18 at 2.51.55 PM

I uploaded the HTML page so you can see the whole thing here and play with it (it will open in a new tab).

I used the //android target from the example app for the profile above, since it’s a little more meaty than a toy example.

The chart shows what all 200 build threads were doing during the build at any given time (one thread per row). The build is divided into several “phases” which are shown as different colored columns on the chart:

  1. The first 1.5 seconds (darkish grey) were spent initializing the build command, which means it was just parsing options and setting up the cache.
  2. The next ~1 second (green) was the loading phase, where Bazel figures out which packages it will need, downloads external dependencies, and finds and parses BUILD files.
  3. The next ~100ms (sliver of light grey) was the analyze dependencies phase, where Bazel figured out which dependencies were cached and clean and so did not need to be rebuilt.
  4. Finally, Bazel moved into the build phase (pink background), actually building all of the things that needed to be built.

There are several other phases that you can see on the legend, but they are, for the most part, too short to even be visible on the chart.

Below the chart, there’s a section for “Execution phase.” “Execution” can be a little confusing in this context: it’s referring to executing the build, not running your program. The execution phase maps to the pink phase in the chart above. In this section is a sub-section called “Critical path, ” which breaks down what your build was waiting on:

Critical path (13.339 s):
    Id        Time Percentage   Description
  6722     48.1 ms    0.36%   Zipaligning apk
  6721      344 ms    2.58%   Generating signed apk
  6720      540 ms    4.05%   Converting bazel-out/local_darwin-fastbuild/bin/android/android_deploy.jar to dex format
  6719      230 ms    1.73%   Building deploy jar android/android_deploy.jar
  6718     1.051 s    7.88%   Building android/libandroid.jar (0 files)
  6717     15.7 ms    0.12%   Extracting interface //android activities
  6716     1.744 s   13.07%   Building android/libactivities.jar (1 files)
  6715      785 ms    5.89%   Processing resources
  6712     3.737 s   28.01%   Building external/default_android_tools/src/tools/android/java/com/google/devtools/build/android/libandroid_builder_lib.jar (17 files) [for host]
  6711     4.843 s   36.30%   Writing file external/default_android_tools/src/tools/android/java/com/google/devtools/build/android/libandroid_builder_lib.jar-2.params [for host]
           1.73 ms    0.01%   [2 middleman actions]

As you can see, the build was “blocked” for more than 8 seconds building libandroid_builder_lib.jar and its params file. Luckily, these files shouldn’t change between builds (unless you update your Android SDK, which shouldn’t happen between every build). If I make a change to libactivities.jar (the actual meat of the program) and rebuild, I get:

$ bazel build --profile profile2 //android
INFO: Writing profile data to '/Users/kchodorow/gitroot/examples/tutorial/profile2'
INFO: Found 1 target...
INFO: From Generating unsigned apk:

THIS TOOL IS DEPRECATED. See --help for more information.

INFO: From Generating signed apk:

THIS TOOL IS DEPRECATED. See --help for more information.

Target //android:android up-to-date:
  bazel-bin/android/android_deploy.jar
  bazel-bin/android/android_unsigned.apk
  bazel-bin/android/android.apk
INFO: Elapsed time: 4.545s, Critical Path: 2.72s

Here is the new profile page for this incremental build.

Note that the critical path is only 2.718 seconds now, not 13.339! If we look at the profile, we can see that the new critical path is much more svelte:

Critical path (2.718 s):
    Id        Time Percentage   Description
   543     52.3 ms    1.92%   Zipaligning apk
   542      342 ms   12.59%   Generating signed apk
   541      577 ms   21.22%   Converting bazel-out/local_darwin-fastbuild/bin/android/android_deploy.jar to dex format
   540      245 ms    9.00%   Building deploy jar android/android_deploy.jar
   539     1.502 s   55.26%   Building android/libactivities.jar (1 files)

Now building libactivities.jar is the most heavyweight operation on the critical path, so we could tackle that by perhaps breaking it into separate libraries that don’t all have to be recompiled every time something changes.

The profiles Bazel generates can be… dense… so feel free to ask on the mailing list if you need any help interpreting them. Also, if you’re interested in more on the subject, check out the documentation on profiling.

Oh, the Mistakes I’ve Seen

A slow database is easily fixed
If you make good choices of fields indexed.
Sometimes the answer is simpler still,
A quick code change may fit the bill.

I’ll be giving an O’Reilly webcast, Scaling with MongoDB, on Friday (9/17). Please sign up if you’re interested in learning some more advanced optimization than what this post gets into. This webcast is, in part, to pimp MongoDB: The Definitive Guide, which will be coming out next week!

These are a few basic tips on making your application better/faster/stronger without knowing anything about indexes or sharding.

Connecting

Connecting to the database is a (relatively) expensive operation. Try to minimize the number of times you connect and disconnect: use persistent connections or connection pooling (depending on your language).

To not waste connections, you have to know what your driver is doing. I see a lot of code like this in PHP:

$connection = new Mongo();
$connection->connect();

What this does is:

  1. The constructor connects to the database.
  2. connect() sees that you’re already connected, assumes you want to reset the connection.
  3. Disconnects from the database.
  4. Connects again.

Gah! You just doubled your execution time.

ObjectIds

ObjectIds seem to make people vaguely uncomfortable, so they convert their ObjectIds into strings (the macaroni and cheese of data types). The problem is, an ObjectId takes up 12 bytes but its string representation takes up 29 bytes (almost two and a half times bigger). The lesson: suck it up and eat your spinachy ObjectIds. You’ll learn to like ’em.

Also, an ObjectId won’t sneakily convert itself into a string on the fly. I see a lot of code like:

id = new ObjectId();
db.foo.insert({"_id" : new ObjectId(id)});
// or, even sillier
db.foo.insert({"_id" : new ObjectId(id.toString())});

If you created an ObjectId and haven’t messed with it, it’s still an ObjectId.

Numbers vs. Strings

MongoDB is type-sensitive and it’s important to use the correct type: numbers for numeric values and strings for strings.

If you have large numbers and you save them as strings (“1234567890” instead of 1234567890), MongoDB may slow down as it strcmps the entire length of the number instead of doing a quicker numeric comparison. Also, “12” is going to be sorted as less than “9”, because MongoDB will use string, not numeric, comparison on the values. This can lead to some surprising results.

Driver-specific

Find out if you’re driver is particularly weaknesses (or strengths). For instance, the Perl driver is one of the fastest drivers, but it sucks at decoding Date types (Perl’s DateTime objects take a long time to create). So, if you want fast Perl programs, avoid dates like the plague or you’ll be puttering along with the Ruby programmers. (Just kidding, Rubyists! Sort of.)

The most important thing is to get to know your language’s documentation and ask if you have any questions.