Transcript:
[MUSIC PLAYING] LARS CLAUSEN: Hello, and thank you for joining us for this talk on persistent workers in Bazel. My name is Lars. I'm a Googler in the Munich office. SUSAN STEINMAN: I'm Susan. I'm a Googler in the New York office. LARS CLAUSEN: And we want to tell you how to get the best performance out of Bazel's persistent workers. First, we're going to give an introduction to the worker concept: what problem were they made to solve, and how do they solve it? Then we're going to go through how to tune the workers for the best performance. Then, for rule maintainers, we're going to explain how to create a worker. And finally we're going to cover some advanced topics and upcoming work.

What's it all about? Performance. This slide shows the time it takes to do a clean build of Bazel itself on a 6-core Linux machine. On the left, the regular non-worker build; on the right, the build with workers. As you can see, with workers your build is about three times as fast. You can also see that sandboxing isn't actually very expensive. And this is a clean build. For incremental builds, you can get even better performance: we've seen up to a six-times speed-up in some of our incremental benchmarks.

So let's take a look at what's behind the speed-up. The basic problem is that Bazel runs many actions. Every single step of a build is a separate action. Without workers, each action runs as a separate process in the operating system. That means you can't keep any kind of internal cache, for instance of ASTs, but more critically, especially for JVM-based tools, you don't benefit at all from Just-In-Time compilation, and you pay the startup cost over and over again: creating a JVM, reading in classes, initializing, et cetera. All of that is wasted time.

Workers solve this problem by keeping a process running for a long time, so that each process can handle multiple pieces of work. The Bazel server manages the lifetime of the workers, their creation and their shutdown, and communicates with them via stdin/stdout, but the workers write their artifacts directly to disk. A worker is often just a thin wrapper around an existing tool. For instance, for Javac, the wrapper calls into the Java compilation API. Workers are on by default for any mnemonic that implements them, because the "worker" strategy comes before "local" in the default strategy order. This is useful for any kind of runtime, but the JVM in particular benefits a lot from it, so we're going to be focusing on some of the things that affect the JVM.

Here's a little schematic view of what happens during a build. You start your build from the command line. That talks over RPC to the Bazel server, and the server then talks to the various workers. In this case, there are two workers for Javac and one for Go, and they just sit around and do their work.

Now you may say that having two separate workers for Javac is again a waste, and that is correct. We should be using threads, or goroutines, or whatever our system has. Recently, an external contribution added exactly that, and we are very grateful for this community contribution. Basically, multiplex workers are an extension to the worker protocol that allows parallel work requests in a single worker process. When a worker implements it, it's used automatically; you can turn it off with the experimental worker multiplex flag if you run into problems. This allows a single cache to be shared across all your compilations, but more importantly it saves memory overhead, especially the overhead of having multiple JVMs.
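As a rough illustration of the multiplex idea, here is a hypothetical sketch (not Bazel's own code, nor the contribution mentioned above): one long-lived process accepts several requests at once, handles each on its own thread, and tags every response with the request id so Bazel can match responses to requests. It assumes the newline-delimited JSON variant of the worker protocol, whose request and response fields are described later in this talk; do_work and the pool size are stand-ins chosen for illustration.

```python
# Hypothetical sketch of a multiplex worker: one long-lived process, one
# thread per in-flight request, responses matched to requests by id.
import json
import sys
import threading
from concurrent.futures import ThreadPoolExecutor

_stdout_lock = threading.Lock()  # responses must not interleave on stdout


def do_work(arguments):
    # Stand-in for the real tool (compiler, bundler, ...), which would write
    # its artifacts directly to the file system and reuse warm in-process state.
    return 0, ""


def handle(request):
    exit_code, output = do_work(request.get("arguments", []))
    response = {
        "exitCode": exit_code,
        "requestId": request.get("requestId", 0),  # lets Bazel match the response
        "output": output,                          # diagnostics go here, not raw stdout
    }
    with _stdout_lock:
        sys.stdout.write(json.dumps(response) + "\n")
        sys.stdout.flush()


def main():
    with ThreadPoolExecutor(max_workers=8) as pool:  # pool size chosen arbitrarily
        for line in sys.stdin:                       # one WorkRequest per line
            if line.strip():
                pool.submit(handle, json.loads(line))


if __name__ == "__main__":
    main()
```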
We have a few stability problems that we are currently looking into, but at Google we are already using multiplex workers internally in some teams.

SUSAN STEINMAN: There are a few flags that help you tune the behavior of workers to get better performance. The most important one is worker_max_instances, which sets the number of workers. It can take a raw number to set the default, and it can also take "Mnemonic=value" to set the number of workers for a particular action type. The default is four, but you almost certainly want something less than that. The reason is that we count the number of workers per worker key, and the worker key depends on a lot of different factors: the arguments, the environment, the executable. The number of possible worker keys is potentially unbounded, so keeping the multiplier small keeps resource usage under control. At Google, we turn workers off by default, and then teams configure the use of certain workers to meet their needs. That's something you might want to do, or you might want to reduce the default number of workers to one and raise the number for certain mnemonics that you care about. That'll be up to you. The right number of workers varies a lot depending on what your build looks like, how much parallelism you have, and what your machine is like. Once you've done the research to determine the right number of workers for each mnemonic, you can set those values with this flag. They can also be set relative to machine resources: you can give a plain number, but you can also give a multiplier of the host CPUs or the host RAM.

As you can see from this graph, it's important to get the number of workers right, because more workers can use more memory. This is a build of Bazel, as in the previous graph. The reason there aren't any more red bars is that the machine ran out of memory and couldn't continue the build with more workers than that. The blue bars, which are the multiplex workers, don't have that problem. If you do want to tune the number of multiplex workers, you can set it with experimental_worker_max_multiplex_instances, which is quite a mouthful.

A few other flags also affect the behavior of workers. Lars will tell you a little more about what sandboxing is, but you can toggle it with the worker_sandboxing flag. high_priority_workers takes a list of mnemonics that are high priority, maybe because those actions are on the critical path, and Bazel will prioritize those actions while throttling actions of other types. worker_extra_flag allows you to pass an arbitrary list of flags to all of your workers. worker_quit_after_build does what you would expect. And the last flag pertains more to creating workers, which I'm about to tell you more about: experimental_worker_allow_json_protocol tells Bazel to allow workers to communicate with it using JSON instead of requiring proto. It doesn't mean that workers have to speak JSON, but if a worker does speak JSON, Bazel will tolerate that.

So how do you create a worker? A worker is actually very simple. It's a binary that accepts the --persistent_worker flag, reads WorkRequests from stdin, does its work, writing any artifacts it creates directly to the file system, and then writes a WorkResponse to stdout. Notably, it shouldn't write anything else to stdout.
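Concretely, a minimal single-request-at-a-time worker loop might look something like the sketch below. It assumes the JSON variant of the protocol (proto is the default), with newline-delimited request and response objects whose fields mirror the WorkRequest and WorkResponse structures described next; the actual "work" here, writing some text to an output file, is invented purely for illustration.

```python
# A minimal, hypothetical persistent worker using the JSON worker protocol:
# one WorkRequest per line on stdin, one WorkResponse per line on stdout.
import json
import sys


def run_one(arguments):
    # Invented "work" for illustration: first argument is an output path, the
    # rest is text to write. A real worker would run its real tool here, and
    # would also expand any "@flagfile" argument (described later in the talk).
    out_path, *words = arguments
    with open(out_path, "w") as f:
        f.write(" ".join(words) + "\n")


def main():
    if "--persistent_worker" not in sys.argv:
        # Without the flag, a worker-capable tool is expected to behave like an
        # ordinary one-shot binary; that path is omitted from this sketch.
        sys.exit(1)
    for line in sys.stdin:
        if not line.strip():
            continue
        request = json.loads(line)
        exit_code, output = 0, ""
        try:
            run_one(request.get("arguments", []))
        except Exception as e:  # report failures through the response, not stdout
            exit_code, output = 1, str(e)
        response = {
            "exitCode": exit_code,
            "requestId": request.get("requestId", 0),
            "output": output,  # error messages go here, never to raw stdout
        }
        sys.stdout.write(json.dumps(response) + "\n")
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```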
Again: the worker only reads WorkRequests (from stdin) and writes WorkResponses (to stdout). It can write errors to stderr, but it shouldn't write them to stdout.

So what is a WorkRequest? It's a structure. It has a list of arguments for the worker. It also has a list of path/digest pairs, which represent the input files the worker is allowed to access. These aren't actually restrictive; they're usually just used for cache verification, and the worker has access to the files it needs. And it contains a request ID, which is 0 if the worker is not a multiplex worker. The worker interprets the request, does what it needs to do, writes what it needs to write to the file system, and then returns a WorkResponse. A WorkResponse contains an exit code (just zero versus non-zero), the same request ID from the WorkRequest, and an output field. As you might remember, workers shouldn't write anything to stdout other than this WorkResponse, so any error messages should go into this output field so that Bazel can parse them. I'll talk a little more about where those go in a minute.

But first, that's the worker part; there will also be a rule that uses that worker. So you've created this binary that follows those rules, and you'll have a target that defines it, maybe a java_binary. Then you'll also have a rule, this is an example in Starlark, that refers to the worker you've created. You can imagine that there's an attribute, maybe called "worker", that holds a label pointing at the worker binary you created. The rule's action also needs two other pieces. It contains a list of arguments, which is a list of strings that get passed to the worker, plus an "@flagfile" argument, which is the location of a file Bazel creates containing the additional arguments for that worker. And it contains execution requirements, a key-value store that must contain "supports-workers": 1 and may also contain "requires-worker-protocol": json or proto. It defaults to proto if you don't list "requires-worker-protocol", but if your worker communicates using JSON, it needs this field, and the experimental_worker_allow_json_protocol flag needs to be passed to the build.

In terms of debugging workers, there are two main places where you can expect to find logs: the Bazel Java log and the worker output files. The Bazel log includes a few things. It contains Bazel's best guess about what happened, so whether there was no response, or whether the response wasn't formatted correctly. It contains the output field of the WorkResponse that you saw earlier. And it contains a path to the log file, which has the stderr of the worker process. Bazel will print all of these if the exit code isn't 0, so if there was an issue. If you want to see the path to that log file even when the exit code is 0, you can pass worker_verbose to your build, and Bazel will print the location of the log file. And because the worker process is just like any other program, you can use whatever tools you'd use to debug another program to debug the worker process as well.

And so I'll turn it over to Lars, who will tell you a bit about sandboxing and dynamic execution, some other topics.
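Pulling those rule pieces together, here is a rough Starlark sketch along the lines Susan describes. The rule name, the //tools:my_lang_worker label, and the .out output are invented for illustration; the "worker" attribute, the flag-file argument, and the execution requirements are the parts the worker contract actually cares about.

```python
# Starlark sketch of a rule whose action can run in a persistent worker.
def _my_lang_library_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".out")

    # Arguments for the tool; Bazel spills them into a flag file and passes a
    # single "@<path>" argument to the worker, as the worker contract expects.
    args = ctx.actions.args()
    args.add("--output", out)
    args.add_all(ctx.files.srcs)
    args.use_param_file("@%s", use_always = True)

    ctx.actions.run(
        mnemonic = "MyLangCompile",
        executable = ctx.executable.worker,
        arguments = [args],
        inputs = ctx.files.srcs,
        outputs = [out],
        execution_requirements = {
            "supports-workers": "1",
            # Only needed if the worker speaks JSON; proto is the default.
            "requires-worker-protocol": "json",
        },
    )
    return [DefaultInfo(files = depset([out]))]

my_lang_library = rule(
    implementation = _my_lang_library_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
        # The worker binary you built, e.g. a java_binary or py_binary target.
        "worker": attr.label(
            default = "//tools:my_lang_worker",
            executable = True,
            cfg = "exec",
        ),
    },
)
```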
LARS CLAUSEN: OK. So that was how you use workers and how you create workers. Let's get into some advanced topics. First of all: sandboxing. As I mentioned, this is a very important concept in Bazel: we want to be able to have hermetic builds, and sandboxing is a main part of that.

Now, all of the tools involved in a build have always had to be cooperative about their sandboxing: a tool could write files somewhere else, and a second action could read them, but we don't want that, because it would break hermeticity. With workers, it gets a little more complicated, because now you have long-running processes that might leave something in memory, use a common temp-file area, or put their cache in a particular place. Or a tool might simply expect that once an action is done the process exits, and so never do a proper cleanup. Workers need to handle that cleanup properly.

With multiplex workers, it gets even worse, because different threads might happen to find each other's temp files, and everything has to be thread safe. So there's even more verification needed to make sure that a multiplex worker is actually properly hermetic. Unfortunately, so far, multiplex workers are inherently not fully hermetic, because they all write into the same directory: there is only one output directory per process, and now we have many worker threads writing into the same territory. They are not sandboxed yet. We are working on how to fix that; basically, we want the request to be able to specify where to put the results.

There is also a potential for output races, in particular with the new dynamic scheduling system. This is a system for remote execution frameworks, which are becoming more and more useful, where you speculatively execute an action both locally and remotely, and whichever finishes first wins. A remote execution framework can be great for massive caching and massive parallelism, which is fantastic when you want to do a clean build, but it's not so great on latency for incremental builds, where local execution is usually best. The output race is that if the remote execution finishes first, we have to prevent the workers from overwriting the result files, because those workers are still running. We have looked at various ways to add locks to prevent this race, but they cost performance. What we need for this to work well is a way to cancel a worker, and a way to sandbox it properly.

Now, canceling is not something the current worker protocol allows. We want to be able to contact a specific worker, or worker thread, and say: stop working. This would make the whole locking problem much, much simpler, and it would allow faster reuse of the worker itself, because it could go right ahead and start on the next WorkRequest. It would also fix the slight annoyance that if you interrupt a build, the workers will still finish their work, which can take a lot of time and CPU. The problem is that cancellation is something we'll need to add to the WorkRequest, and the workers will need to figure out how to actually support it correctly.

These are the next things we are going to be working on: we want to improve multiplex stability, we want to add sandboxing, we want to be able to cancel workers, and we also want to cap resource usage by workers. Currently, Bazel doesn't know how many resources, CPU and memory in particular, the workers are using. We need to feed that information back, so that Bazel can balance having the fastest build possible with a box that is still usable in the meantime.
So these are all things we’re going to be working on in the coming months. Thank you for listening in. And we hope this will help you build faster with Bazel, and make all your builds both fast and correct. [MUSIC PLAYING]