Collecting transitive runfiles with Skylark

Bazel has a concept it calls runfiles for files that a binary uses during execution. For example, a binary might need to read in a CSV, an ssh key, or a .json file. These files are generally specified separately from your sources for a couple of reasons:

  • Bazel can understand that it is a runtime dependency, not a compile-time one (so if the runfile changes, the binary does not need to be rebuilt).
  • The type is less restrictive: most rules restrict what their sources can “look like” (e.g., Java sources end in .java or .jar, Go sources end in .go, Python sources end in .py or .pyc, etc.).

Thus, these runfiles are often specified in a separate data attribute.

If you’re writing a Skylark rule that combines several executables, you will probably want the rule to combine the runfiles for all of them as well. Let’s create a rule that does exactly that. As a toy example, the rule below creates an executable and has one attribute, data, which can list other files or rules. The executable, when run, just lists all of the runfiles it has available.

For example, suppose you had the following BUILD file:

list_runfiles(
    name = 'main-course', 
    data = ['lasagna.txt']
)

If we ran bazel run :main-course, it would print:

./my_workspace/main-course
./my_workspace/lasagna.txt
./MANIFEST

However, we can also provide list_runfiles targets as data. For example, our BUILD file could say:

list_runfiles(
    name = 'main-course', 
    data = [
        'lasagna.txt',
        ':side-dishes',
        ':drink',
    ]
)

list_runfiles(
    name = 'side-dishes', 
    data = [
        ':soup',
        ':salad',
    ]
)

list_runfiles(
    name = 'soup', 
    data = ['gazpacho.txt']
)

list_runfiles(
    name = 'salad', 
    data = ['waldorf.txt']
)

list_runfiles(
    name = 'drink', 
    data = ['milk.txt']
)

Then running bazel run :main-course will print:

.
./my_workspace
./my_workspace/side-dishes
./my_workspace/waldorf.txt
./my_workspace/salad
./my_workspace/milk.txt
./my_workspace/main-course
./my_workspace/lasagna.txt
./my_workspace/gazpacho.txt
./my_workspace/soup
./my_workspace/drink
./MANIFEST

:main-course has collected all of its transitive runfiles in its runfiles tree.
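To make the transitive part concrete, here is a minimal plain-Python sketch of the BUILD file above. It is only a model, not Bazel API: the TARGETS dictionary and the transitive_runfiles function are names made up for illustration, and it assumes (as the listing above suggests) that each target's own executable shows up in the runfiles tree.

```python
# Plain-Python model of the BUILD file above; not Bazel API.
# Each key is a target name, each value is that target's `data` list.
TARGETS = {
    "main-course": ["lasagna.txt", ":side-dishes", ":drink"],
    "side-dishes": [":soup", ":salad"],
    "soup": ["gazpacho.txt"],
    "salad": ["waldorf.txt"],
    "drink": ["milk.txt"],
}


def transitive_runfiles(name, targets=TARGETS):
    """Return the set of runfiles for target `name`, following `data` labels."""
    # Assumption: a target's own executable is part of its runfiles.
    collected = {name}
    for entry in targets[name]:
        if entry.startswith(":"):
            # A label: pull in everything that target needs at runtime.
            collected |= transitive_runfiles(entry[1:], targets)
        else:
            # A plain file: add it directly.
            collected.add(entry)
    return collected


if __name__ == "__main__":
    for path in sorted(transitive_runfiles("main-course")):
        print("./my_workspace/" + path)
```

Calling transitive_runfiles("main-course") yields the same nine entries that appear under ./my_workspace in the listing above, in whatever order you choose to sort them.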

Here’s the list_runfiles rule definition:

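def _list_runfiles_impl(ctx):
  # Write a small shell script that lists everything in its own runfiles tree.
  ctx.file_action(
    output = ctx.outputs.executable,
    content = '\n'.join([
        "#!/bin/bash",
        "cd $0.runfiles",
        "find"
    ]),
    executable = True)
  # collect_data = True gathers the runfiles of everything listed in data.
  return struct(runfiles = ctx.runfiles(collect_data = True))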

list_runfiles = rule(
    attrs = {
        "data": attr.label_list(
            allow_files = True,
            cfg = DATA_CFG,
        ),
    },
    executable = True,
    implementation = _list_runfiles_impl,
)

The key line is runfiles = ctx.runfiles(collect_data = True). collect_data automatically “harvests” the runfiles from the data, srcs, and deps attributes. We’ve only defined data for this rule, so that’s the only attribute it uses.

Note that this won’t work if you change "data": attr.label_list( to something not covered by collect_data, e.g., "stuff": attr.label_list(. (Theoretically this should be covered by transitive_files, but I couldn’t actually get that working.)
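The attribute-name restriction can be pictured with a tiny plain-Python model. This is not how Bazel implements collect_data; HARVESTED_ATTRS and the collect_data function here are hypothetical stand-ins that only mirror the behavior described above (data, srcs, and deps are looked at, anything else is ignored).

```python
# Plain-Python model of the behavior described above; not Bazel API.
# Assumption: only these attribute names are harvested.
HARVESTED_ATTRS = ("data", "srcs", "deps")


def collect_data(rule_attrs):
    """Return entries from only the attributes that are harvested."""
    found = []
    for attr_name in HARVESTED_ATTRS:
        # Attributes the rule does not define contribute nothing.
        found.extend(rule_attrs.get(attr_name, []))
    return found


# A rule that uses "data" has its files picked up...
with_data = collect_data({"data": ["lasagna.txt"]})
# ...but the same files under "stuff" are silently skipped.
with_stuff = collect_data({"stuff": ["lasagna.txt"]})
```

In this model with_data contains lasagna.txt while with_stuff is empty, which is the same symptom you would see after renaming the attribute in the real rule.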
