This builds on the first part of the tutorial. In this post, we will make the rule actually produce an executable.
Capturing the output from scalac
At the end of the tutorial last time, we were calling scalac, but ignoring the result:
(cd /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg && exec env - /bin/bash -c 'external/scala/bin/scalac HelloWorld.scala; echo '''blah''' > bazel-out/local_darwin-fastbuild/bin/hello-world.sh')
If you look at the directory where the action is running (/private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg in my case), you can see that HelloWorld.class and HelloWorld$.class are created. This directory is called the execution root; it is where Bazel executes build actions. Bazel uses separate directory trees for source code, executing build actions, and output files (bazel-out/). Files won’t get moved from the execution root to the output tree unless we tell Bazel we want them.
We want our compiled scala program to end up in bazel-out/, but there’s a small complication. With languages like Java (and Scala), a single source file might contain inner classes that cause multiple .class files to be generated by a single compile action. Bazel cannot know until it runs the action how many class files are going to be generated. However, Bazel requires that each action declare, in advance, what its outputs will be. The way to get around this is to package up the .class files and make the resulting archive the build output.
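To make the "unknown number of files in, one declared file out" idea concrete, here is a minimal Python sketch, not part of the rule itself. It fakes the compiler's output with two placeholder .class files and packs whatever it finds into a single archive. The function name package_classes and the temporary directories are made up for illustration, and the result is a plain zip with no manifest rather than something the jar tool produced.

```python
import os
import tempfile
import zipfile


def package_classes(root, archive_path):
    """Pack every .class file under root into one archive; return the names added."""
    added = []
    with zipfile.ZipFile(archive_path, "w") as archive:
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                if filename.endswith(".class"):
                    full_path = os.path.join(dirpath, filename)
                    # Store names relative to root, like entries in a class tree.
                    name = os.path.relpath(full_path, root)
                    archive.write(full_path, name)
                    added.append(name)
    return sorted(added)


# Stand-in for what scalac leaves behind: we do not know the count up front.
workdir = tempfile.mkdtemp()
for class_file in ["HelloWorld.class", "HelloWorld$.class"]:
    with open(os.path.join(workdir, class_file), "wb") as handle:
        handle.write(b"\xca\xfe\xba\xbe")  # placeholder bytes, not a real class

# The single, known-in-advance output lives outside the directory being walked.
archive_path = os.path.join(tempfile.mkdtemp(), "hello-world.jar")
packaged = package_classes(workdir, archive_path)
print(packaged)
```

However many .class files show up, the only path that has to be declared ahead of time is the archive.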
In this example, we’ll add the .class files into a .jar. Let’s add that to the outputs, which should now look like this:
outputs = {
    'jar': "%{name}.jar",
    'sh': "%{name}.sh",
},
In the impl function, our command is getting a bit complicated, so I’m going to change it to an array of commands and then join them on "\n" in the action:
def impl(ctx):
    # Each entry is one shell command; they run in order.
    cmd = [
        "%s %s" % (ctx.file._scalac.path, ctx.file.src.path),
        "find . -name '*.class' -print > classes.list",
        "jar cf %s @classes.list" % (ctx.outputs.jar.path),
    ]
    ctx.action(
        inputs = [ctx.file.src],
        # Join on a newline so bash sees three separate commands.
        command = "\n".join(cmd),
        outputs = [ctx.outputs.jar]
    )
This will compile the src, find all of the .class files, and add them to the output jar. If we run this, we get:
$ bazel build -s :hello-world
INFO: Found 1 target...
>>>>> # //:hello-world [action 'Unknown hello-world.jar']
(cd /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg && exec env - /bin/bash -c 'external/scala/bin/scalac HelloWorld.scala
find . -name '''*.class''' -print > classes.list
jar cf bazel-out/local_darwin-fastbuild/bin/hello-world.jar @classes.list')
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
INFO: Elapsed time: 4.774s, Critical Path: 4.06s
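If you want to check the command string without running a build, the same list-and-join logic works as ordinary Python. This is only a sketch: build_command is a made-up helper, and the three paths are copied from the output above rather than taken from a real ctx object.

```python
def build_command(scalac_path, src_path, jar_path):
    """Build the multi-line shell command the action hands to bash."""
    steps = [
        "%s %s" % (scalac_path, src_path),
        "find . -name '*.class' -print > classes.list",
        "jar cf %s @classes.list" % jar_path,
    ]
    # One command per line, so bash runs them one after another.
    return "\n".join(steps)


command = build_command(
    "external/scala/bin/scalac",
    "HelloWorld.scala",
    "bazel-out/local_darwin-fastbuild/bin/hello-world.jar",
)
print(command)
```

The printed text should match the three lines inside the bash -c '...' argument, apart from the extra quoting Bazel adds around '*.class'.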
Let’s take a look at what hello-world.jar contains:
$ jar tf bazel-bin/hello-world.jar
META-INF/
META-INF/MANIFEST.MF
HelloWorld$.class
HelloWorld.class
Looks good! However, we cannot actually run this jar, because java doesn’t know what the main class should be:
$ java -jar bazel-bin/hello-world.jar
no main manifest attribute, in bazel-bin/hello-world.jar
Similar to the java_binary rule, let’s add a main_class attribute to scala_binary and put it in the jar’s manifest. Add 'main_class' : attr.string(), to scala_binary’s attrs and change cmd to the following:
cmd = [
    "%s %s" % (ctx.file._scalac.path, ctx.file.src.path),
    "echo Manifest-Version: 1.0 > MANIFEST.MF",
    "echo Main-Class: %s >> MANIFEST.MF" % ctx.attr.main_class,
    "find . -name '*.class' -print > classes.list",
    "jar cfm %s MANIFEST.MF @classes.list" % (ctx.outputs.jar.path),
]
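The two echo lines produce a very small manifest. As a rough Python sketch of the same text (manifest_text and parse_manifest are names I made up, and this ignores the manifest format's line-wrapping rules), note that every line, including the last, ends with a newline. echo gives you that for free; the jar tool may not parse the last line properly without it.

```python
def manifest_text(main_class):
    """Return the manifest the two echo commands write, newline-terminated."""
    lines = [
        "Manifest-Version: 1.0",
        "Main-Class: %s" % main_class,
    ]
    # The final attribute needs its own line terminator.
    return "\n".join(lines) + "\n"


def parse_manifest(text):
    """Split simple 'Key: value' lines into a dict (no continuation lines)."""
    attributes = {}
    for line in text.splitlines():
        if line:
            key, value = line.split(": ", 1)
            attributes[key] = value
    return attributes


manifest = manifest_text("HelloWorld")
attributes = parse_manifest(manifest)
print(attributes["Main-Class"])
```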
Remember to update your actual BUILD file to add a main_class attribute:
# BUILD
load("/scala", "scala_binary")

scala_binary(
    name = "hello-world",
    src = "HelloWorld.scala",
    main_class = "HelloWorld",
)
Now building and running gives you:
$ bazel build :hello-world
INFO: Found 1 target...
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
INFO: Elapsed time: 4.663s, Critical Path: 4.05s
$ java -jar bazel-bin/hello-world.jar
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Predef$
    at HelloWorld$.main(HelloWorld.scala:4)
    at HelloWorld.main(HelloWorld.scala)
Caused by: java.lang.ClassNotFoundException: scala.Predef$
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 2 more
Closer! Now it cannot find some Scala libraries it needs. You can add the Scala library jar manually on the command line to see that our jar actually does work:
$ java -cp $(bazel info output_base)/external/scala/lib/scala-library.jar:bazel-bin/hello-world.jar HelloWorld
Hello, world!
So we need our rule to generate an executable that basically runs this command, which can be accomplished by adding another action to our build. First we’ll add a dependency on scala-library.jar by adding it as a hidden attribute:
'_scala_lib': attr.label(
    default=Label("@scala//:lib/scala-library.jar"),
    allow_files=True,
    single_file=True),
Making scala_binary targets executable
Let’s pause here for a moment and switch gears: we’re going to tell Bazel that scala_binary targets are binaries. To do this, we add executable = True to the rule definition (alongside attrs, not inside it) and get rid of the reference to hello-world.sh in the outputs:
...
    outputs = {
        'jar': "%{name}.jar",
    },
    implementation = impl,
    executable = True,
)
This says that scala_binary(name = "foo", ...) should have an action that creates a binary called foo, which can be referenced via ctx.outputs.executable in the implementation function. We can now use bazel run :hello-world (instead of bazel build :hello-world; ./bazel-bin/hello-world.sh).
The executable we want to create is the java command from above, so we add the second action to impl, this one a file action (since we’re just generating a file with certain content, not executing a series of commands to generate a .jar):
cp = "%s:%s" % (ctx.outputs.jar.basename, ctx.file._scala_lib.path)
content = [
    "#!/bin/bash",
    "echo Running from $PWD",
    "java -cp %s %s" % (cp, ctx.attr.main_class),
]
ctx.file_action(
    # Join on a newline so each entry becomes one line of the script.
    content = "\n".join(content),
    output = ctx.outputs.executable,
)
Note that I also added a line to the file to echo where it is being run from. If we now use bazel run, you’ll see:
$ bazel run :hello-world
INFO: Found 1 target...
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
  bazel-bin/hello-world
INFO: Elapsed time: 2.694s, Critical Path: 0.08s
INFO: Running command line: bazel-bin/hello-world
Running from /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg/bazel-out/local_darwin-fastbuild/bin/hello-world.runfiles
Error: Could not find or load main class HelloWorld
ERROR: Non-zero return code '1' from command: Process exited with status 1.
Whoops, it’s not able to find the jars! And what is that path it’s running the binary from, hello-world.runfiles?
The runfiles directory
bazel run runs the binary from the runfiles directory, a directory that is different from the source root, execution root, and output tree mentioned above. The runfiles directory should contain all of the resources needed by the executable during execution. Note that this is not the execution root, which is used during the bazel build step. When you actually execute something created by Bazel, its resources need to be in the runfiles directory.
In this case, our executable needs to access hello-world.jar and scala-library.jar. To add these files, the API is somewhat strange. You must return a struct containing a runfiles object from the rule implementation. Thus, add the following as the last line of your impl function:
return struct(runfiles = ctx.runfiles(files = [ctx.outputs.jar, ctx.file._scala_lib]))
Now if you run it again, it’ll print:
$ bazel run :hello-world
INFO: Found 1 target...
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
  bazel-bin/hello-world
INFO: Elapsed time: 0.416s, Critical Path: 0.00s
INFO: Running command line: bazel-bin/hello-world
Running from /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg/bazel-out/local_darwin-fastbuild/bin/hello-world.runfiles
Hello, world!
Hooray!
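The reason this works is that the launcher's classpath entries are relative paths, so java resolves them against whatever directory the script is running in. Here is a small Python sketch of that lookup, under some assumptions: missing_classpath_entries is a made-up helper, the temporary directory stands in for hello-world.runfiles, and I am guessing that the Scala library's path is external/scala/lib/scala-library.jar (it follows the same pattern as external/scala/bin/scalac).

```python
import os
import tempfile


def missing_classpath_entries(classpath, working_dir):
    """Return the relative classpath entries that do not exist under working_dir."""
    return [
        entry
        for entry in classpath.split(":")
        if not os.path.exists(os.path.join(working_dir, entry))
    ]


# Stand-in for hello-world.runfiles before the rule returns any runfiles.
runfiles = tempfile.mkdtemp(suffix=".runfiles")
classpath = "hello-world.jar:external/scala/lib/scala-library.jar"  # assumed paths
before = missing_classpath_entries(classpath, runfiles)

# Roughly what returning ctx.runfiles(files = [...]) arranges for us:
# the listed files become reachable from inside the runfiles directory.
for entry in classpath.split(":"):
    path = os.path.join(runfiles, entry)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "wb").close()
after = missing_classpath_entries(classpath, runfiles)

print(before)
print(after)
```

Before the files are there, both entries are missing, which matches the "Could not find or load main class" error; afterwards nothing is missing.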
However! If we run it as bazel-bin/hello-world, it won’t be able to find the jars (because we’re not in the runfiles directory). To find the runfiles directory regardless of where the binary is run from, change your content variable to the following:
content = [
    "#!/bin/bash",
    "case \"$0\" in",
    "/*) self=\"$0\" ;;",
    "*) self=\"$PWD/$0\";;",
    "esac",
    "(cd $self.runfiles; java -cp %s %s)" % (cp, ctx.attr.main_class),
]
This way, if it’s run from bazel run, $0 will be the absolute path to the binary (in my case, /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg/bazel-out/local_darwin-fastbuild/bin/hello-world). If it’s run via bazel-bin/hello-world, $0 will be just that: bazel-bin/hello-world. Either way, we’ll end up in the runfiles directory before executing the command.
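If the case statement is hard to read, here is the same decision written as a Python sketch. The function names and the /home/me/project working directory are made up for illustration; the real script does this in bash using $0 and $PWD.

```python
def resolve_self(argv0, cwd):
    """Mirror the case statement: keep absolute paths, prefix relative ones with cwd."""
    if argv0.startswith("/"):
        return argv0
    return cwd + "/" + argv0


def runfiles_dir(argv0, cwd):
    """The directory the launcher changes into before starting java."""
    return resolve_self(argv0, cwd) + ".runfiles"


# Invoked with an absolute path, as bazel run does.
absolute = runfiles_dir(
    "/private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg"
    "/bazel-out/local_darwin-fastbuild/bin/hello-world",
    "/anywhere",
)

# Invoked by hand from a hypothetical project directory.
relative = runfiles_dir("bazel-bin/hello-world", "/home/me/project")

print(absolute)
print(relative)
```

In both cases the result is an absolute path ending in hello-world.runfiles, so the cd in the launcher lands in the right place.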
Now our rule is successfully generating a binary. You can see the full code for this example on GitHub.
In the final part of this tutorial, we’ll fix the remaining issues:
- No support for multiple source files, never mind dependencies.
- [action 'Unknown hello-world.jar'] is pretty ugly.
Until next time!
This has been great – the need for writing or understanding other developers’ custom rules while still trying to grok the Bazel Way is a big drag. I have been following along using Bazel 0.6.1 on OS X. I figured out how to get part 1 almost working using cfg="host". My current issue is that the build cannot find the scalac compiler. I know this from looking at the output from the build:
cat /private/var/tmp/_bazel_johnferguson/0516c5fa4e8865dd38d08261954254c9/execroot/__main__/bazel-out/_tmp/action_outs/stderr-1
Which shows me:
/bin/bash: external/scala/bin/scalac: No such file or directory
Which makes sense, because when I ls external/scala I do not see bin:
BUILD.bazel WORKSPACE scala-2.11.7 scala-2.11.7.tgz
So in the WORKSPACE I tried strip_prefix="scala-2.11.7" but the stderr output for the build still says:
/bin/bash: external/scala/bin/scalac: No such file or directory
But if I do this from my project directory:
/private/var/tmp/_bazel_johnferguson/0516c5fa4e8865dd38d08261954254c9/execroot/__main__/external/scala/bin/scalac
I see scalac help listing. So it is there…. any suggestions on how to debug this part and get it compiling?
Glad this is helpful, but I’ve left Google now and I am no longer working on Bazel. I’m not sure what’s going wrong, but if you’d like some help, I recommend asking on StackOverflow or the Bazel mailing list.