Keeping your deps tidy

My coworker Carmi just published a blog post on the Bazel blog about how Bazel tracks Java dependencies. Bazel has some nice facilities built in to let you know when you need to add dependencies:

ERROR: /home/kchodorow/test/a/BUILD:24:1: Building libA.jar (1 source file) failed: Worker process sent response with exit code: 1.
A.java:6: error: [strict] Using type C from an indirect dependency (TOOL_INFO: "//:C"). See command below **
  C getC() {
  ^
** Please add the following dependencies:
  //:C  to //:A

He mentioned unused_deps, a great tool to go the other way: what if your BUILD file declares dependencies you’re not using? unused_deps lets you quickly clean up your BUILD files.

To get unused_deps, clone the buildtools repository and build it:

$ git clone git@github.com:bazelbuild/buildtools.git
$ cd buildtools
$ bazel build //unused_deps

Now go to your project and run it:

$ cd ~/my-project
$ ~/gitroot/buildtools/bazel-bin/unused_deps/unused_deps //... > buildozer-cmds.sh

This will print a bunch of info to stderr as it runs, but when it’s done, you should have a list of buildozer commands in buildozer-cmds.sh. For example, running this on the Bazel codebase yields:

buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:auth_and_tls_options' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:build-base' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:concurrent' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
buildozer 'remove deps //src/main/java/com/google/devtools/build/lib:events' //src/tools/remote_worker/src/main/java/com/google/devtools/build/remote:remote
...

This is a list of shell commands, so now you need to execute this file. The buildozer tool also lives in the buildtools repository, so you just have to build that and then add it to your path:

$ cd ~/gitroot/buildtools
$ bazel build //buildozer
$ cd ~/my-project
$ chmod +x buildozer-cmds.sh
$ PATH=$HOME/gitroot/buildtools/bazel-bin/buildozer:$PATH ./buildozer-cmds.sh

This will run all of the buildozer commands and then you can commit the changes, e.g.,

$ git diff
diff --git a/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD b/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD
index 022a4037d..5d5cdf8d0 100644
--- a/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD
+++ b/src/tools/benchmark/javatests/com/google/devtools/build/benchmark/codegenerator/BUILD
@@ -6,7 +6,6 @@ java_test(
     deps = [
         "//src/tools/benchmark/java/com/google/devtools/build/benchmark/codegenerator:codegenerator_lib",
         "//third_party:guava",
-        "//third_party:junit4",
         "//third_party:truth",
     ],
 )
@@ -17,7 +16,6 @@ java_test(
     deps = [
         "//src/tools/benchmark/java/com/google/devtools/build/benchmark/codegenerator:codegenerator_lib",
         "//third_party:guava",
-        "//third_party:junit4",
         "//third_party:truth",
     ],
 )
@@ -39,7 +37,6 @@ java_test(
     deps = [
         "//src/tools/benchmark/java/com/google/devtools/build/benchmark/codegenerator:codegenerator_lib",
         "//third_party:guava",
-        "//third_party:junit4",
         "//third_party:truth",
     ],
 )
@@ -50,7 +47,6 @@ java_test(
     deps = [
         "//src/main/java/com/google/devtools/common/options",
         "//src/tools/benchmark/java/com/google/devtools/build/benchmark/codegenerator:codegenerator_lib",
-        "//third_party:junit4",
         "//third_party:truth",
     ],
 )
...

It’s a good idea to run unused_deps regularly to keep things tidy. For example, the Bazel project does not run it automatically and has nearly 1000 unneeded deps (oops). You might want to add a git hook or something to your CI to run it on every change.
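If you want to wire that into a git hook, here’s a minimal sketch of a pre-commit hook in Python (the unused_deps path is an assumption; point it at wherever you built it):

#!/usr/bin/env python
# Hypothetical pre-commit hook: reject the commit if unused_deps
# suggests any removals.
import subprocess
import sys

UNUSED_DEPS = "/path/to/buildtools/bazel-bin/unused_deps/unused_deps"

# unused_deps writes buildozer commands to stdout (progress info goes
# to stderr), so any stdout output means there's something to clean up.
commands = subprocess.check_output([UNUSED_DEPS, "//..."])
if commands.strip():
    sys.stderr.write("Unused deps found; to fix, run:\n" + commands)
    sys.exit(1)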

[Image: a messy closet vs. a clean closet]
unused_deps: the Container Store of build tools.

How to Skylark – the class

I’ve heard a lot of users say they want a more comprehensive introduction to writing build extensions for Bazel (aka, Skylark). One of my friends has been working on Google Classroom and they just launched, so I created a build extensions crash course. I haven’t written much content yet (and I don’t understand exactly how Classroom works), but we can learn together! If you’re interested:

It’s free and you can get in on the ground floor of… whatever this is. If you’ve enjoyed or found my posts on Skylark useful, this should be a more serious and well-structured look at the subject.

I’ll try to release content at least once a week until we’ve gotten through all the material that seems sensible, I get bored, or people stop “attending.”

Low-fat Skylark rules – saving memory with depsets

In my previous post on aspects, I used a Bazel aspect to generate a simple Makefile for a project. In particular, I passed a list of .o files up the tree like so:

  dotos = [ctx.label.name + ".o"]
  for dep in ctx.rule.attr.deps:
    # Create a new array by concatenating this .o with all previous .o's.
    dotos += dep.dotos

  return struct(dotos = dotos)

In a toy example, this works fine. However, in a real project, we might have tens of thousands of .o files across the build tree. Every cc_library would create a new array and copy every .o file into it, only to move up the tree and make another copy. It’s very inefficient.

Enter nested sets. Basically, you can create a set with pointers to other sets, which isn’t inflated until it’s needed. Thus, you can build up a set of dependencies using minimal memory.

To use nested sets instead of arrays in the previous example, replace the lists in the code with depset and |:

  dotos = depset([ctx.label.name + ".o"])
  for dep in ctx.rule.attr.deps:
    dotos = dotos | dep.dotos

Nested sets use | for union-ing two sets together.

“Set” isn’t a great name for this structure (IMO), since they’re actually trees and, if you think of them as sets, you’ll be very confused about their ordering if you try to iterate over them.

For example, let’s say you have the following macro in a .bzl file:

def order_test():
  srcs = depset(["src1", "src2"])
  first_deps = depset(["dep1", "dep2"])
  second_deps = depset(["dep3", "dep4"])
  src_and_deps = srcs | first_deps
  everything = second_deps | src_and_deps

  for item in everything:
    print(item)

Now call this from a BUILD file:

load('//:playground.bzl', 'order_test')
order_test()

And “build” the BUILD file to run the function:

$ bazel build //:BUILD
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep1.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep2.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: src1.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: src2.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep3.
WARNING: /usr/local/google/home/kchodorow/test/a/playground.bzl:7:5: dep4.

How did that code end up generating that ordering? You can think of each | as creating a new set whose direct items come from the left operand, with the right operand nested inside it as a subtree. So src_and_deps has the direct items src1 and src2, with first_deps nested beneath it; everything then adds another level, with the direct items dep3 and dep4 and src_and_deps nested beneath it:

everything: {dep3, dep4}
└── src_and_deps: {src1, src2}
    └── first_deps: {dep1, dep2}

The iterator then does a postorder traversal: for each set, it visits the nested sets before the direct items. That yields dep1 and dep2 (the deepest set), then src1 and src2, and finally dep3 and dep4.

This is just the default ordering; you can specify a different one. See the docs for more info on depset.
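For example, here’s a sketch of the same macro asking for a pre-order traversal instead (assuming a Bazel version that accepts the "preorder" order name):

def order_test_preorder():
  # Pre-order visits a set's direct items before its nested sets.
  srcs = depset(["src1", "src2"], order = "preorder")
  deps = depset(["dep1", "dep2"], order = "preorder")
  # Sets being unioned together must have compatible orders.
  everything = srcs | deps

  for item in everything:
    print(item)  # src1, src2, dep1, dep2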

Aspects: the fan-fic of build rules

Aspects are a Bazel feature that’s basically like fan-fic, if build rules were stories: aspects let you add features that require intimate knowledge of the build graph, but that the rule maintainer would never want to add.

For example, let’s say we want to be able to generate Makefiles from a Bazel project’s C++ targets. Bazel isn’t going to add support for this to the built-in C++ rules. However, lots of projects might want to support a couple of build systems, so it would be nice to be able to automatically generate build files for Make. So let’s say we have a simple Bazel C++ project with a couple of rules in the BUILD file:

cc_library(
    name = "lib",
    srcs = ["lib.cc"],
    hdrs = ["lib.h"],
)

cc_binary(
    name = "bin",
    srcs = ["bin.cc"],
    deps = [":lib"],
)

We can use aspects to piggyback on Bazel’s C++ rules and generate new outputs (Makefiles) from them. It’ll take each Bazel C++ rule and generate a .o-file make target for it. For the cc_binary, it’ll link all of the .o files together. Basically, we’ll end up with a Makefile containing:

bin : bin.o lib.o
	g++ -o bin bin.o lib.o

bin.o : bin.cc
	g++ -c bin.cc

lib.o : lib.cc
	g++ -c lib.cc

(If you have any suggestions about how to make this better, please let me know in the comments, I’m definitely not an expert on Makefiles and just wanted something super-simple.) I’m assuming a basic knowledge of Bazel and Skylark (e.g., you’ve written a Skylark macro before).

Create a .bzl file to hold your aspect. I’ll call mine make.bzl. Add the aspect definition:

makefile = aspect(
    implementation = _impl,
    attr_aspects = ["deps"],
)

This means that the aspect will follow the “deps” attribute to traverse the build graph. We’ll invoke it on //:bin, and it’ll follow //:bin’s dep to //:lib. The aspect’s implementation will be run on both of these targets.

Add the _impl function. We’ll start by just generating a hard-coded Makefile:

def _impl(target, ctx):
  # If this is a cc_binary, generate the actual Makefile.
  outputs = []
  if ctx.rule.kind == "cc_binary":
    output = ctx.new_file("Makefile")
    content = "bin : bin.cc lib.cc lib.hntg++ -o bin bin.cc lib.ccn"
    ctx.file_action(content = content, output = output)
    outputs = [output]

  return struct(output_groups = {"makefiles" : set(outputs)})

Now we can run this:

$ bazel build //:bin --aspects make.bzl%makefile --output_groups=makefiles
INFO: Found 1 target...
INFO: Elapsed time: 0.901s, Critical Path: 0.00s
$

Bazel doesn’t print anything, but it has generated bazel-bin/Makefile. Let’s create a symlink to it in our main directory, since we’ll keep regenerating it and trying it out:

$ ln -s bazel-bin/Makefile Makefile 
$ make
g++ -o bin bin.cc lib.cc
$

The Makefile works, but is totally hard-coded. To make it more dynamic, first we’ll make the aspect generate a .o target for each Bazel rule. For this, we need to look at the sources and propagate that info up.

The base case is:

  source_list = [f.path for src in ctx.rule.attr.srcs for f in src.files]
  cmd = target.label.name + ".o : {sources}\n\tg++ -c {sources}".format(
      sources = " ".join(source_list)
  )

Basically: run g++ on all of the srcs for a target. You can add a print(cmd) to see what cmd ends up looking like. (Note: We should probably do something with headers and include paths here, too, but I’m trying to keep things simple and it isn’t necessary for this example.)

Now we want to collect this command, plus all of the commands we’ve gotten from any dependencies (since this aspect will have already run on them):

  transitive_cmds = [cmd]
  for dep in ctx.rule.attr.deps:
    transitive_cmds += dep.cmds

Finally, at the end of the function, we’ll return this whole list of commands, so that rules “higher up” in the tree have deps with a “cmds” attribute:

  return struct(
      output_groups = {"makefiles" : set(outputs)},
      cmds = transitive_cmds,
  )

Now we can change our output file to use this list:

    ctx.file_action(
        content = "nn".join(transitive_cmds) + "n",
        output = output
    )

Altogether, our aspect implementation now looks like:

def _impl(target, ctx):
  source_list = [f.path for src in ctx.rule.attr.srcs for f in src.files]
  cmd = target.label.name + ".o : {sources}\n\tg++ -c {sources}".format(
      sources = " ".join(source_list)
  )

  # Collect all of the previously generated Makefile targets.
  transitive_cmds = [cmd]
  for dep in ctx.rule.attr.deps:
    transitive_cmds += dep.cmds

  # If this is a cc_binary, generate the actual Makefile.
  outputs = []
  if ctx.rule.kind == "cc_binary":
    output = ctx.new_file("Makefile")
    ctx.file_action(
        content = "\n\n".join(transitive_cmds) + "\n",
        output = output
    )
    outputs = [output]

  return struct(
      output_groups = {"makefiles" : set(outputs)},
      cmds = transitive_cmds,
  )

If we run this, we get the following Makefile:

bin.o : bin.cc
	g++ -c bin.cc

lib.o : lib.cc
	g++ -c lib.cc

Getting closer!

Now we need the last “bin” target to be automatically generated, so we need to keep track of all the intermediate .o files we’re going to link together. To do this, we’ll add a “dotos” list that this aspect propagates up the deps.

This is similar to the transitive_cmds list, so add a couple lines to our deps traversal function:

  # Collect all of the previously generated Makefile targets.
  dotos = [ctx.label.name + ".o"]
  transitive_cmds = [cmd]
  for dep in ctx.rule.attr.deps:
    dotos += dep.dotos
    transitive_cmds += dep.cmds

Now propagate them up the tree:

  return struct(
      output_groups = {"makefiles" : set(outputs)},
      cmds = transitive_cmds,
      dotos = dotos,
  )

And finally, add the binary target to the Makefile:

  # If this is a cc_binary, generate the actual Makefile.
  outputs = []
  if ctx.rule.kind == "cc_binary":
    output = ctx.new_file("Makefile")
    content = "{binary} : {dotos}\n\tg++ -o {binary} {dotos}\n\n{deps}\n".format(
        binary = target.label.name,
        dotos = " ".join(dotos),
        deps = "\n\n".join(transitive_cmds)
    )
    ctx.file_action(content = content, output = output)
    outputs = [output]

If we run this, we get:

bin : bin.o lib.o
	g++ -o bin bin.o lib.o

bin.o : bin.cc
	g++ -c bin.cc

lib.o : lib.cc
	g++ -c lib.cc

Documentation about aspects can be found on bazel.io. As with Skylark rules, I find aspects a little difficult to read because they’re inherently recursive, but it helps to break them down (and use lots of prints).

Using AutoValue with Bazel

AutoValue is a really handy library to eliminate boilerplate in your Java code. Basically, if you have a “plain old Java object” with some fields, there are all sorts of things you need to do to make it work “good,” e.g., implement equals and hashCode so it can be used in collections, make all of its fields final (and ideally immutable), make the fields private and accessed through getters, etc. AutoValue generates all of that for you.

To get AutoValue to work with Bazel, I ended up modifying cushon’s example. There were a couple of things I didn’t like about it, mainly that the AutoValue plugin definitions had to live in my project’s BUILD files. I set things up so they’re defined in the AutoValue repository instead, so I figured I’d share what I came up with.

In your WORKSPACE file, add a new_http_archive for the AutoValue jar. I’m using the one in Maven, but not using maven_jar because I want to override the BUILD file to provide AutoValue as both a Java library and a Java plugin:

new_http_archive(
    name = "auto_value",
    url = "http://repo1.maven.org/maven2/com/google/auto/value/auto-value/1.3/auto-value-1.3.jar",
    build_file_content = """
java_import(
    name = "jar",
    jars = ["auto-value-1.3.jar"],
)

java_plugin(
    name = "autovalue-plugin",
    generates_api = 1,
    processor_class = "com.google.auto.value.processor.AutoValueProcessor",
    deps = [":jar"],
)

java_library(
    name = "processor",
    exported_plugins = [":autovalue-plugin"],
    exports = [":jar"],
    visibility = ["//visibility:public"],
)
""",
)

Then you can depend on @auto_value//:processor in any java_library target:

java_library(
    name = "project",
    srcs = ["Project.java"],
    deps = ["@auto_value//:processor"],
)

…and Bob’s your uncle.

The Mixed-Up Directories of Mrs. Bazel E. Frankweiler

Bazel has several directory trees that it uses during a build.

Sources

The most obvious directory is the source tree where your code lives and where you run your builds. This is, by default, what Bazel uses for source files.

However, you can combine several source trees by using the --package_path option. This basically overlays them, from Bazel’s point of view. For instance, if you had:

/
  home/
    user/
      gitroot/
        my-project/
          WORKSPACE
          BUILD
          foo/
  usr/
    local/
      other-proj/
        WORKSPACE
        BUILD
        bar/
           BUILD

Then if you ran bazel build --package_path=/home/user/gitroot/my-project:/usr/local/other-proj //..., Bazel would “see” the directory tree:

WORKSPACE
BUILD
  foo/
  bar/
    BUILD

If a package is defined in multiple package paths, the first path “wins” (e.g., the top-level package containing foo/ above).

Finally, source code for external repositories is tucked away in Bazel’s output base. You can see it by running:

$ ls $(bazel info output_base)/external

Execution root

Once Bazel figures out what packages a build is going to use, it creates a symlink tree called the execution root, where the build actually happens.* It basically traverses all of the packages it found and comes up with the most efficient way it can to symlink them together. For example, in the directory tree above, you’d end up with:

execroot/
  my-project/
    WORKSPACE -> /home/user/gitroot/my-project/WORKSPACE
    BUILD -> /home/user/gitroot/my-project/BUILD
    foo/ -> /home/user/gitroot/my-project/foo
    bar/ -> /usr/local/other-proj/bar
  bazel_tools/
    ... # Tools built into the Bazel binary
  local_config_cc/
    ... # C++ compiler tools

You can check out the execution root by running:

$ ls $(bazel info execution_root)

Sandbox

Here’s the * from the execution root section above! If you’re on Linux (and, hopefully, OS X soon), your build actually takes place in a sandbox based on the execution root. All of the files your build needs (and hopefully none it doesn’t) are mounted into their own namespace, and the build runs in a hermetically sealed environment: no network or filesystem access, other than what you specified.

You can see what’s being mounted and where by running Bazel with a couple extra flags:

$ bazel build --sandbox_debug --verbose_failures //...

Derived roots

We’re working on improving how configurable Bazel is, but for years the main configuration options have been the platform you’re building for and your compiler options. If you run an optimized build, then a debug build, and then an optimized build again, you’d like your results to be cached from the first run, not overwritten each time. In a somewhat questionable design move, Bazel uses a special set of output directories that are named based on the configuration. So if you build an optimized binary, it’ll create it under execroot/my-project/bazel-out/local-opt/bin/my-binary. If you then build it as a debug binary, it’ll put it under execroot/my-project/bazel-out/local-dbg/bin/my-binary. Then, if you build it optimized again, it’ll be able to switch back to using the local-opt directory. (However, Bazel uses symlinks out the wazoo, so I don’t know why it doesn’t use symlinks to track which configuration is being used. Seems like it’d be a lot easier to have the outputs directly under execroot/my-project.)

Also, Bazel distinguishes between files created by genrule and… everything else. Almost all output is under bazel-out/config/bin, but genrules are under bazel-out/config/genfiles.

(I’m kind of bitter about these files, I’ve been working on a change for what feels like months because of this stupid directory structure.)

Note that Bazel symlinks execroot/ws/bazel-out/config/bin to bazel-bin and execroot/ws/bazel-out/config/genfiles to bazel-genfiles. These “convenience symlinks” are the outputs shown at the end of your build.

Runfiles

Suppose you build a binary in your favorite language. When you run that binary, it tries to load a file during runtime. Bazel encourages you to declare these files as runfiles, runtime dependencies of your build. If they change, your binary won’t be recompiled (because they’re runtime, not compile-time dependencies) but they will cause tests to be re-run.

Bazel creates a directory for these files as a sibling to your binary: if you build //foo:my-binary, the runfiles will be under bazel-bin/foo/my-binary.runfiles. You can explore the directory or see a list of them all in the runfiles manifest, also a sibling of the binary:

$ cat bazel-bin/foo/my-binary.runfiles_manifest

Note that a binary can use files from anywhere on the filesystem (they’re binaries, after all). We just recommend using runfiles so that you can keep them together and express them as build dependencies.
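For the curious: in a Skylark rule, you declare runfiles by returning them from the implementation function. Here’s a minimal sketch (a hypothetical rule of mine, not a built-in one):

def _tool_impl(ctx):
  # Write a trivial "binary" (a real rule would compile something).
  ctx.file_action(
      output = ctx.outputs.executable,
      content = "#!/bin/bash\necho hello\n",
      executable = True,
  )
  # Declare the data files as runtime dependencies: they end up under
  # bazel-bin/<package>/<name>.runfiles/ and changing them re-runs tests.
  return struct(runfiles = ctx.runfiles(files = ctx.files.data))

tool = rule(
    attrs = {"data": attr.label_list(allow_files = True)},
    executable = True,
    implementation = _tool_impl,
)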

Custom, locally-sourced output filenames

Skylark lets you use templates in your output file name, e.g., this would create a file called target.timestamp:

touch = rule(
    outputs = {"date_and_time": "%{name}.timestamp"},
    implementation = _impl,
)

So if you had touch(name = "foo") in a BUILD file and built :foo, you’d get foo.timestamp.

I’d always used %{name}, but I found out the other day that you can actually use other attributes, too. For example, you could have:

greet = rule(
    attrs = {"my_name": attr.string()},
    outputs = {"greeting": "hi-there-%{my_name}"},
    implementation = _impl,
)

Then if you have greet(name = "a-greeting", my_name = "kristina") and build :a-greeting, you’ll get “hi-there-kristina” as an output file.

The entire source for this example is available as a GitHub gist (all four lines of the implementation function not shown above).
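If you don’t feel like clicking through: the implementation function just has to create the declared output. For the greet rule, something like this would do (my sketch, not necessarily the gist’s exact code):

def _impl(ctx):
  # The "greeting" output template is exposed as ctx.outputs.greeting.
  ctx.file_action(
      output = ctx.outputs.greeting,
      content = "hi there, %s!\n" % ctx.attr.my_name,
  )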

Using environment variables in Skylark repository rules

If you’ve ever used the AppEngine rules, you know the pain of waiting for all 200 stupid megabytes of API to download. The pain is doubled because I already have a copy of the SDK on my workstation.

To use the local rules, all I have to do is override the @com_google_appengine_java repository in my WORKSPACE file, like so:

load("//appengine:appengine.bzl", "APPENGINE_BUILD_FILE")
new_local_repository(
    name = "com_google_appengine_java",
    path = "/Users/kchodorow/Downloads",
    build_file_content = APPENGINE_BUILD_FILE,
)

However, this is still imperfect: I don’t really want to maintain changes that basically amount to a performance optimization in my local client.

By using environment variables in the appengine_repository rule, we can do even better. I’m going to create a new rule that checks if the APPENGINE_SDK_PATH environment variable is set. If it is, it will use a local_repository to pull in AppEngine, otherwise it will fall back on downloading the .zip.

So, to start, let’s take a look at the existing rule that pulls in the AppEngine SDK. As of this writing, it looks like this:

  native.new_http_archive(
      name = "com_google_appengine_java",
      url = "http://central.maven.org/maven2/com/google/appengine/appengine-java-sdk/%s/%s.zip" % (APPENGINE_VERSION, APPENGINE_DIR),
      sha256 = "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4",
      build_file_content = APPENGINE_BUILD_FILE,
  )

First, let’s modify this to use a custom repository rule instead of native.new_http_archive:

def _find_locally_or_download_impl(repository_ctx):
  repository_ctx.download_and_extract(
      "http://central.maven.org/maven2/com/google/appengine/appengine-java-sdk/%s/%s.zip" % (APPENGINE_VERSION, APPENGINE_DIR),
      ".", "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4", "", "")
  repository_ctx.file("BUILD", APPENGINE_BUILD_FILE)

_find_locally_or_download = repository_rule(
  implementation = _find_locally_or_download_impl,
  local = False,
)

def appengine_repositories():
  _find_locally_or_download(name = "com_google_appengine_java")

This code functions (basically) identically to the original, so now let’s add an option for using a local path. Modify the implementation function to check the environment:

def _find_locally_or_download_impl(repository_ctx):
  if 'APPENGINE_SDK_PATH' in repository_ctx.os.environ:
    path = repository_ctx.os.environ['APPENGINE_SDK_PATH']
    if path == "":
      fail("APPENGINE_SDK_PATH set, but empty")
    repository_ctx.symlink(path, APPENGINE_DIR)
  else:
    repository_ctx.download_and_extract(
        "http://central.maven.org/maven2/com/google/appengine/appengine-java-sdk/%s/%s.zip" % (APPENGINE_VERSION, APPENGINE_DIR),
        ".", "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4", "", "")
  repository_ctx.file("BUILD", APPENGINE_BUILD_FILE)

Now we can download a copy of the SDK and try our rule (feel free to use an existing copy, if you have one on your system).

APPENGINE_SDK_PATH=/path/to/your/sdk/download bazel build //your/appengine/app

Problems with this:

  • You can’t actually set APPENGINE_SDK_PATH to where Bazel downloaded the SDK the first time around ($(bazel info output_base)/external/com_google_appengine_java), which is suuuuper tempting to do. If you do, Bazel will delete the downloaded copy (because you changed the repository def) and then symlink the empty directory to itself. Never what you want.
  • It caches the environment variable, so if you change your mind you have to run bazel clean to use a different APPENGINE_SDK_PATH. I think this is a bug, although there’s some debate about that. (A possible workaround is sketched below.)
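For the second problem, newer Bazel versions added an environ parameter to repository_rule; declaring the variable there tells Bazel to invalidate the cached repository when it changes (a sketch, assuming a version with this parameter):

_find_locally_or_download = repository_rule(
  implementation = _find_locally_or_download_impl,
  # Re-run this rule whenever APPENGINE_SDK_PATH changes.
  environ = ["APPENGINE_SDK_PATH"],
)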

Resting BUILD face

I am super excited that pmbethe09 and laurentlb just put in a bunch of extra work to open-source Buildifier. Buildifier is a great tool we use at Google to format BUILD files. It automatically organizes attributes, corrects indentation, and generally makes them more readable and excellent.

To try it out, clone the repo and build it with Bazel:

$ git clone git@github.com:bazelbuild/buildifier.git
$ cd buildifier
$ bazel build //buildifier
Extracting Bazel installation...
...

INFO: Found 1 target...
Target //buildifier:buildifier up-to-date:
  bazel-bin/buildifier/buildifier.a
  bazel-bin/buildifier/buildifier

INFO: Elapsed time: 203.309s, Critical Path: 7.54s
INFO: Build completed successfully, 8 total actions

Now try it out on an ugly BUILD file:

$ echo 'cc_library(srcs = ["foo.cc", "bar.cc"], name = "foo")' > BUILD
$ ~/gitroot/buildifier/bazel-bin/buildifier/buildifier BUILD
$ cat BUILD
cc_library(
    name = "foo",
    srcs = [
        "bar.cc",
        "foo.cc",
    ],
)

Finally, why run commands manually when you can have your editor do it for you? I use emacs, so I can set up a hook like this:

(add-hook 'after-save-hook
          (lambda()
            (if (string-match "BUILD" (file-name-base (buffer-file-name)))
                (progn
                  (shell-command (concat "/path/to/buildifier/bazel-bin/buildifier/buildifier " (buffer-file-name)))
                  (find-alternate-file (buffer-file-name))))))

You could also set up a git hook to run this before committing, if that’s more your style. Regardless, give it a try! It’s a quick, easy way to make your BUILD files more readable.

Communicating between Bazel rules: how to use Skylark providers

Rules in Bazel often need information from their dependencies. My previous post touched on a special case of this: figuring out what a dependency’s runfiles are. However, Skylark is actually capable of passing arbitrary information between rules using a system known as providers.

Suppose we have a rule, analyze_flavors, that figures out what all of the flavors are in a dish. Our build file looks like:

load(":food.bzl", "analyze_flavors")

analyze_flavors(
    name = "burger",
    ingredients = [
        ":beef",
        ":ketchup",
    ],
)

analyze_flavors(
    name = "beef",
    tastes_like = "umame",
)

analyze_flavors(
    name = "ketchup",
    tastes_like = "sweet",
)

We want to build up a flavor profile for :burger, based on its ingredients.

To do this, food.bzl looks like:

def _flavor_impl(ctx):
  # Build up a flavor profile from this rule & its ingredients.
  flavor_profile = []
  for ingredient in ctx.attr.ingredients:
    if ingredient.flavor != None:
      flavor_profile += ingredient.flavor

  if ctx.attr.tastes_like != "":
    flavor_profile += [ctx.attr.tastes_like]

  # Write the list of flavors to a file.
  ctx.file_action(
    output = ctx.outputs.out,
    content = "%s tastes like %s\n" % (
        ctx.label.name, " and ".join(flavor_profile))
  )

  # Return the list of flavors so it can be used by rules that depend on this.
  return struct(flavor = flavor_profile)

analyze_flavors = rule(
    attrs = {
        "ingredients": attr.label_list(),
        "tastes_like": attr.string(),
    },
    outputs = {"out": "flavors-of-%{name}"},
    implementation = _flavor_impl,
)

The return struct(flavor = flavor_profile) line at the end of the implementation is where the rule returns a provider, flavor, to be consumed by its reverse dependencies (the targets that depend on it).

Our BUILD file gives us the following build graph:

:burger
├── :beef
└── :ketchup

:burger depends on :beef and :ketchup. :beef and :ketchup each provide :burger with a flavor. Thus, if we build :burger and check its output file, we get:

$ bazel build :burger

INFO: Found 1 target...
Target //:burger up-to-date:
  bazel-bin/flavors-of-burger

INFO: Elapsed time: 0.270s, Critical Path: 0.00s
INFO: Build completed successfully, 2 total actions
$ cat bazel-bin/flavors-of-burger
burger tastes like umami and sweet

This can be used to communicate rich information from rule-to-rule in Skylark. See the Skylark cookbook for another example of providers.
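Providers also aren’t limited to one rule type: any rule that depends on :burger can consume its flavor provider. For instance, here’s a hypothetical menu rule (a sketch using the same old-style provider API) that collects the flavors of its dishes:

def _menu_impl(ctx):
  lines = []
  for dish in ctx.attr.dishes:
    # Each dish is an analyze_flavors target, so it has a flavor provider.
    lines.append("%s: %s" % (dish.label.name, " and ".join(dish.flavor)))
  ctx.file_action(
    output = ctx.outputs.out,
    content = "\n".join(lines) + "\n",
  )

menu = rule(
    attrs = {"dishes": attr.label_list()},
    outputs = {"out": "%{name}-menu.txt"},
    implementation = _menu_impl,
)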