Custom, locally-sourced output filenames

Skylark lets you use templates in your output file names, e.g., this would create a file called <target name>.timestamp:

touch = rule(
    outputs = {"date_and_time": "%{name}.timestamp"},
    implementation = _impl,
)

So if you had touch(name = "foo") in a BUILD file and built :foo, you’d get foo.timestamp.
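
Spelled out, that BUILD file might look like this (touch.bzl is a hypothetical name for wherever the rule above lives):

load(":touch.bzl", "touch")

# Building :foo produces bazel-bin/foo.timestamp.
touch(name = "foo")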

I’d always used %{name}, but I found out the other day that you can actually use other attributes, too. For example, you could have:

greet = rule(
    attrs = {"my_name": attr.string()},
    outputs = {"greeting": "hi-there-%{my_name}"},
    implementation = _impl,
)

Then if you have greet(name = "a-greeting", my_name = "kristina") and build :a-greeting, you’ll get “hi-there-kristina” as an output file.

The entire source for this example is available as a GitHub gist (including the four-line implementation function not shown above).
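
For reference, a minimal sketch of what the greet rule’s implementation function might look like (the gist has the real thing):

def _impl(ctx):
  # ctx.outputs.greeting is the output file declared in the outputs dict.
  ctx.file_action(
      output = ctx.outputs.greeting,
      content = "hello, %s!\n" % ctx.attr.my_name,
  )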

Using environment variables in Skylark repository rules

If you’ve ever used the AppEngine rules, you know the pain that is waiting for all 200 stupid megabytes of API to be downloaded. The pain is doubled because I already have a copy of the SDK on my workstation.

To use the local rules, all I have to do is override the @com_google_appengine_java repository in my WORKSPACE file, like so:

load("//appengine:appengine.bzl", "APPENGINE_BUILD_FILE")
new_local_repository(
    name = "com_google_appengine_java",
    path = "/Users/kchodorow/Downloads",
    build_file_content = APPENGINE_BUILD_FILE,
)

However, this is still imperfect: I don’t really want to maintain changes that basically amount to a performance optimization in my local client.

By using environment variables in the appengine_repository rule, we can do even better. I’m going to create a new rule that checks if the APPENGINE_SDK_PATH environment variable is set. If it is, it will use a local_repository to pull in AppEngine, otherwise it will fall back on downloading the .zip.

So, to start, let’s take a look at the existing rule that pulls in the AppEngine SDK. As of this writing, it looks like this:

  native.new_http_archive(
      name = "com_google_appengine_java",
      url = "http://central.maven.org/maven2/com/google/appengine/appengine-java-sdk/%s/%s.zip" % (APPENGINE_VERSION, APPENGINE_DIR),
      sha256 = "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4",
      build_file_content = APPENGINE_BUILD_FILE,
  )

First, let’s modify this to use a repository rule instead of native.new_http_archive:

def _find_locally_or_download_impl(repository_ctx):
  repository_ctx.download_and_extract(
      "http://central.maven.org/maven2/com/google/appengine/appengine-java-sdk/%s/%s.zip" % (APPENGINE_VERSION, APPENGINE_DIR),
      ".", "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4", "", "")
  repository_ctx.file("BUILD", APPENGINE_BUILD_FILE)

_find_locally_or_download = repository_rule(
  implementation = _find_locally_or_download_impl,
  local = False,
)

def appengine_repositories():
  _find_locally_or_download(name = "com_google_appengine_java")

This code functions identically (basically) to the original code, so now let’s add an option for using a local path. Modify the implementation function to check the environment:

def _find_locally_or_download_impl(repository_ctx):
  if 'APPENGINE_SDK_PATH' in repository_ctx.os.environ:
    path = repository_ctx.os.environ['APPENGINE_SDK_PATH']
    if path == "":
      fail("APPENGINE_SDK_PATH set, but empty")
    repository_ctx.symlink(path, APPENGINE_DIR)
  else:
    repository_ctx.download_and_extract(
        "http://central.maven.org/maven2/com/google/appengine/appengine-java-sdk/%s/%s.zip" % (APPENGINE_VERSION, APPENGINE_DIR),
        ".", "189ec08943f6d09e4a30c6f86382a9d15b61226f042ee4b7c066b2466fd980c4", "", "")
  repository_ctx.file("BUILD", APPENGINE_BUILD_FILE)

Now we can download a copy of the SDK and try our rule (feel free to use an existing copy, if you have one on your system).

APPENGINE_SDK_PATH=/path/to/your/sdk/download bazel build //your/appengine/app

Problems with this:

  • You can’t actually set APPENGINE_SDK_PATH to where Bazel downloaded the SDK the first time around ($(bazel info output_base)/external/com_google_appengine_java), which is suuuuper tempting to do. If you do, Bazel will delete the downloaded copy (because you changed the repository def) and then symlink the empty directory to itself. Never what you want.
  • It caches the environment variable, so if you change your mind you have to run bazel clean to use a different APPENGINE_SDK_PATH. I think this is a bug, although there’s some debate about that.

Resting BUILD face

I am super excited that pmbethe09 and laurentlb just put in a bunch of extra work to open source Buildifier. Buildifier is a great tool we use at Google to format BUILD files. It automatically organizes attributes, corrects indentation, and generally makes them more readable and excellent.

To try it out, clone the repo and build it with Bazel:

$ git clone git@github.com:bazelbuild/buildifier.git
$ cd buildifier
$ bazel build //buildifier
Extracting Bazel installation...
...

INFO: Found 1 target...
Target //buildifier:buildifier up-to-date:
  bazel-bin/buildifier/buildifier.a
  bazel-bin/buildifier/buildifier

INFO: Elapsed time: 203.309s, Critical Path: 7.54s
INFO: Build completed successfully, 8 total actions

Now try it out on an ugly BUILD file:

$ echo 'cc_library(srcs = ["foo.cc", "bar.cc"], name = "foo")' > BUILD
$ ~/gitroot/buildifier/bazel-bin/buildifier/buildifier BUILD
$ cat BUILD
cc_library(
    name = "foo",
    srcs = [
        "bar.cc",
        "foo.cc",
    ],
)

Finally, why run commands manually when you can have your editor do it for you? I use emacs, so I can set up a hook like this:

(add-hook 'after-save-hook
          (lambda()
            (if (string-match "BUILD" (file-name-base (buffer-file-name)))
                (progn
                  (shell-command (concat "/path/to/buildifier/bazel-bin/buildifier/buildifier " (buffer-file-name)))
                  (find-alternate-file (buffer-file-name))))))

You could also set up a git hook to run this before committing, if that’s more your style. Regardless, give it a try! It’s a quick, easy way to make your BUILD files more readable.
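
If you go the git route, here’s a rough (untested) sketch of what a pre-commit hook could look like, written in Python with a made-up buildifier path:

#!/usr/bin/env python
# Hypothetical .git/hooks/pre-commit: run buildifier on staged BUILD files.
import subprocess
import sys

BUILDIFIER = "/path/to/buildifier/bazel-bin/buildifier/buildifier"

staged = subprocess.check_output(
    ["git", "diff", "--cached", "--name-only"]).decode().splitlines()
for path in staged:
    if path.split("/")[-1] == "BUILD":
        if subprocess.call([BUILDIFIER, path]) != 0:
            sys.stderr.write("buildifier failed on %s\n" % path)
            sys.exit(1)
        # buildifier rewrites the file in place, so re-stage the result.
        subprocess.call(["git", "add", path])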

Communicating between Bazel rules: how to use Skylark providers

Rules in Bazel often need information from their dependencies. My previous post touched on a special case of this: figuring out what a dependency’s runfiles are. However, Skylark is actually capable of passing arbitrary information between rules using a system known as providers.

Suppose we have a rule, analyze_flavors, that figures out what all of the flavors are in a dish. Our build file looks like:

load(":food.bzl", "analyze_flavors")

analyze_flavors(
    name = "burger",
    ingredients = [
        ":beef",
        ":ketchup",
    ],
)

analyze_flavors(
    name = "beef",
    tastes_like = "umame",
)

analyze_flavors(
    name = "ketchup",
    tastes_like = "sweet",
)

We want to build up a flavor profile for :burger, based on its ingredients.

To do this, food.bzl looks like:

def _flavor_impl(ctx):
  # Build up a flavor profile from this rule & its ingredients.
  flavor_profile = []
  for ingredient in ctx.attr.ingredients:
    if ingredient.flavor != None:
      flavor_profile += ingredient.flavor

  if ctx.attr.tastes_like != "":
    flavor_profile += [ctx.attr.tastes_like]

  # Write the list of flavors to a file.
  ctx.file_action(
    output = ctx.outputs.out,
    content = "%s tastes like %sn" % (
        ctx.label.name, " and ".join(flavor_profile))
  )

  # Return the list of flavors so it can be used by rules that depend on this.
  return struct(flavor = flavor_profile)

analyze_flavors = rule(
    attrs = {
        "ingredients": attr.label_list(),
        "tastes_like": attr.string(),
    },
    outputs = {"out": "flavors-of-%{name}"},
    implementation = _flavor_impl,
)

The return struct(flavor = flavor_profile) line at the end of the implementation function is where the rule returns a provider, flavor, to be consumed by its reverse dependencies (the targets that depend on it).

Our BUILD file gives us the following build graph:

[diagram: the build graph, with :burger depending on :beef and :ketchup]

:burger depends on :beef and :ketchup. :beef and :ketchup each provide :burger with a flavor. Thus, if we build :burger and check its output file, we get:

$ bazel build :burger

INFO: Found 1 target...
Target //:burger up-to-date:
  bazel-bin/flavors-of-burger

INFO: Elapsed time: 0.270s, Critical Path: 0.00s
INFO: Build completed successfully, 2 total actions
$ cat bazel-bin/flavors-of-burger
burger tastes like umami and sweet

This can be used to communicate rich information from rule-to-rule in Skylark. See the Skylark cookbook for another example of providers.
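
To make the rule-to-rule part a bit more concrete, here’s a hypothetical second rule (not part of the original example) that only consumes the flavor provider, say to generate a menu:

def _menu_impl(ctx):
  lines = []
  for dish in ctx.attr.dishes:
    # 'flavor' is the provider returned by analyze_flavors above.
    lines += ["%s: %s" % (dish.label.name, " and ".join(dish.flavor))]
  ctx.file_action(
      output = ctx.outputs.menu,
      content = "\n".join(lines) + "\n",
  )

menu = rule(
    attrs = {"dishes": attr.label_list()},
    outputs = {"menu": "menu-for-%{name}"},
    implementation = _menu_impl,
)

A menu(name = "dinner", dishes = [":burger"]) target would then end up with “burger: umami and sweet” in bazel-bin/menu-for-dinner.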

Collecting transitive runfiles with Skylark

Bazel has a concept it calls runfiles for files that a binary uses during execution. For example, a binary might need to read in a CSV, an ssh key, or a .json file. These files are generally specified separately from your sources for a couple of reasons:

  • Bazel can understand that it is a runtime dependency, not a compile-time one (so if a runfile changes, the binary does not need to be rebuilt).
  • The type is less restrictive: most rules have restrictions on what their sources can “look like” (e.g., Java sources end in .java or .jar, Go sources end in .go, Python sources end in .py or .pyc, etc.).

Thus, these runfiles are often specified in a separate data attribute.
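
For example, a (hypothetical) binary that reads a config file at runtime might look like this, with the config in data rather than srcs:

py_binary(
    name = "server",
    srcs = ["server.py"],
    # Read at runtime; editing it doesn't require rebuilding the binary.
    data = ["config.json"],
)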

If you’re writing a Skylark rule that combines several executables, you will probably want that rule to also combine the runfiles for all of them. Let’s create a rule that can combine several executables and include all of their runfiles. As a toy example, I created a rule below that creates an executable. The rule has one attribute, data, that can be other files or rules. The executable, when run, will just list all of the runfiles it has available.

For example, suppose you had the following BUILD file:

list_runfiles(
    name = 'main-course', 
    data = ['lasagna.txt']
)

If we ran bazel run :main-course, it would print:

./my_workspace/main-course
./my_workspace/lasagna.txt
./MANIFEST

However, we can also provide list_runfiles targets as data. For example, our BUILD file could say:

list_runfiles(
    name = 'main-course', 
    data = [
        'lasagna.txt',
        ':side-dishes',
        ':drink',
    ]
)

list_runfiles(
    name = 'side-dishes', 
    data = [
        ':soup',
        ':salad',
    ]
)

list_runfiles(
    name = 'soup', 
    data = ['gazpacho.txt']
)

list_runfiles(
    name = 'salad', 
    data = ['waldorf.txt']
)

list_runfiles(
    name = 'drink', 
    data = ['milk.txt']
)

Then running bazel run :main-course will print:

.
./my_workspace
./my_workspace/side-dishes
./my_workspace/waldorf.txt
./my_workspace/salad
./my_workspace/milk.txt
./my_workspace/main-course
./my_workspace/lasagna.txt
./my_workspace/gazpacho.txt
./my_workspace/soup
./my_workspace/drink
./MANIFEST

:main-course has collected all of its transitive runfiles in its runfiles tree.

Here’s the list_runfiles rule definition:

def _list_runfiles_impl(ctx): 
  ctx.file_action(
    output = ctx.outputs.executable,
    content = '\n'.join([
        "#!/bin/bash",   
        "cd $0.runfiles",
        "find"
    ]),
    executable = True)
  return struct(runfiles = ctx.runfiles(collect_data = True))

list_runfiles = rule(
    attrs = {
        "data": attr.label_list(
            allow_files = True,
            cfg = DATA_CFG,
        ),
    },
    executable = True,
    implementation = _list_runfiles_impl,
)

The key line is runfiles = ctx.runfiles(collect_data = True). collect_data will automatically “harvest” the runfiles from the data, srcs, and deps attributes. We’ve only defined data for this rule, so that’s what it will use.

Note that this won’t work if you change "data": attr.label_list( to something not covered by collect_data, e.g., "stuff": attr.label_list(. (Theoretically this should be covered by transitive_files, but I couldn’t actually get that working.)


Startup idea #6ec4e42a-28cc-4425-9ebc-61ac8e224580: Adventurer’s gear for geeky hikers

I’m going to start “calling” my startup ideas in the same way Andy Dwyer calls band names.


So, first up: it’s like REI for D&D players.

We’d sell a “basic adventurer’s kit” that came with iron rations, wineskin, torches, 50 feet of rope, etc.

Then you could get “class specialization” kits, for example:

  • Rogue: contains lockpicks, a pack of cards, and invisible ink.
  • Wizard: parchment, ink, a dozen small vials of reagents, and an orb.
  • Cleric: bandages, salves, holy symbol.

We could also offer Tolkien-esque maps of hiking areas and fancy medieval-looking bags/knives/hiking boots. See what carrying 40lbs of gear into the woods actually feels like! Then get it as a gift for a friend.

Gotta get the gear.
Gotta get the gear.

Using a generated header file as a dependency

Someone asked me today about how to use a generated header as a C++ dependency in Bazel, so I figured I’d write up a quick example.

Create a BUILD file with a genrule that generates the header and a cc_library that wraps it, say, foo/BUILD:

genrule(
    name = "header-gen",
    outs = ["my-header.h"],
    # This command would probably actually call whatever tool generates the header.
    cmd = "echo 'int x();' > $@",
)

cc_library(
    name = "lib",
    hdrs = ["my-header.h"],
    srcs = ["lib.cc"],
    visibility = ["//visibility:public"]
)

Now you can depend on //foo:lib as you would a “normal” cc_library:

cc_binary(
    name = "bin",
    srcs = ["bin.cc"],
    deps = ["//foo:lib"],
)

And bin.cc would look like:

#include "foo/my-header.h"

// ...
int main() {
   x();  // Uses x defined in my-header.h.
}

Operation: Crappy Sewing Machine commences

This weekend, I went to a bra-making workshop and won a sewing machine in a raffle. It isn’t really crappy, but I spent a couple of hours un-jamming it, so I’m bitter.

The interesting thing about this machine is that it has a built-in camera, so you can take photos and video of exactly what you’re sewing and see them on the app. You can also buy new stitches from your phone and transfer them to your machine, so I started getting curious: what protocol is my sewing machine using? Could I write my own client?

I took a look at the manual, but there was nothing more technical than how to install the app in there. I checked the website, no other documentation there. I debated contacting customer service, but if I liked talking to people I wouldn’t be a programmer, so I fired up Wireshark and took a look at the network. I’m not too proficient with Wireshark, though, so I couldn’t figure out how to make it capture anything useful.

After a couple hours of fighting with it, I gave in and emailed customer support. I figured maybe they’d just forward me to a developer who would be happy to tell me about their protocol. Nope:

As far as the communication from the sewing machine to the app goes, I don’t know all the nuts and bolts but I do know it is proprietary information and is one of the features that makes the Spiegel 60609 so unique!

Bleh.

I realized that it would probably be easier for me to decompile the app, rather than sniffing the network, so I downloaded the APK using a sketchy service (I’m not sure if this is the best one out there, but it’s the least-offensive one I found) and undexed it using dex2jar:

$ chmod +x *.sh
$ ./d2j-dex2jar.sh path/to/Spiegel Social Sewing App_v1.0.4_apkpure.com.apk
dex2jar ../sewing-machine/Spiegel Social Sewing App_v1.0.4_apkpure.com.apk -> ./Spiegel Social Sewing App_v1.0.4_apkpure.com-dex2jar.jar

I opened it up in IntelliJ and boom, source code. Unfortunately, IntelliJ’s built-in decompiler choked on the one most interesting class (com.spiegel.android.spiegel.app.ui.settings.SpiegelMachineFacadeImpl). I tried to debug why (it could open every other .class file), but realized it would probably be easier to try another decompiler. I fired up JD-GUI and out popped the source!

[screenshot of the decompiled source]

Turned out my sewing machine is running a PHP server, which is easy enough to communicate with. I think there are ~20 of these machines in the wild, so this is unlikely to be of any use to anyone, ever, but I look forward to writing my own client.

Here’s a video of it jamming the first time I attempted to use it:

configure: error: lib_i_don’t_care_about.so not found.

I work on Bazel, so I don’t usually get to see it from a user’s point of view. However, yesterday I added seven new projects to Bazel’s continuous integration, all of which promptly broke. I started cloning them and trying to fix them.

These projects were various user-contributed rules for Bazel: rules for building Go, .NET, AppEngine, and more. I had never built any code in most of these languages (Rust? Sass? D?). However, using Bazel, I could build and debug each of the projects without installing any prerequisites or copying libraries to certain places on my machine. Bazel handled downloading and configuring all of the libraries and tools that I needed. And now my patches are starting to get merged and the Go rules, at least, are green! One down, six to go.