When a build is canceled, a SIGINT is sent to Buck's entire process tree, which kills any persistent workers. Unfortunately, this also caused the WorkerShellSteps in the next build to hang.
This PR fixes that specific hang, but not the core issue of persistent workers dying unexpectedly. I'm not sure how to tackle the core issue without either:
- Adding a new message type to the protocol (e.g. "ping").
- Adding some number of retries per step when using persistent workers (at least as many as there are workers in the pool).
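
For illustration only, the retry option could look roughly like the sketch below. It is not part of this PR. `RetrySketch`, `Worker`, and `runWithRetries` are hypothetical names, not Buck classes, and the sketch assumes a dead worker surfaces as an `IOException` when a job is sent to it.

```java
import java.io.IOException;
import java.util.function.Supplier;

public class RetrySketch {
  /** Hypothetical stand-in for a persistent worker; not Buck's actual API. */
  public interface Worker {
    String execute(String job) throws IOException;
  }

  /**
   * Tries the job on up to poolSize + 1 workers. Assumption: every worker
   * already in the pool may be dead, so one attempt beyond the pool size is
   * needed to reach a freshly started worker.
   */
  public static String runWithRetries(Supplier<Worker> borrowWorker, String job, int poolSize)
      throws IOException {
    if (poolSize < 0) {
      throw new IllegalArgumentException("poolSize must be non-negative");
    }
    int maxAttempts = poolSize + 1;
    IOException lastFailure = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Worker worker = borrowWorker.get();
      try {
        return worker.execute(job);
      } catch (IOException e) {
        // Assume the worker process is gone; drop it and try the next one.
        lastFailure = e;
      }
    }
    throw lastFailure;
  }
}
```

The drawback is that a step pays for one failed attempt per dead worker before it makes progress, which is part of why this PR does not take that route.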
This PR solves the particular issue above (SIGINT and clean death) by killing workers that are interrupted in the middle of a build step. It also improves logging to make debugging worker tools a bit easier.
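
The idea can be sketched as follows. This is a simplified illustration, not the actual diff. `WorkerPoolSketch` and its nested `Worker` interface are hypothetical names. The sketch assumes a worker is only safe to reuse if its job ran to completion.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class WorkerPoolSketch {
  /** Hypothetical stand-in for a persistent worker process. */
  public interface Worker {
    String execute(String job) throws InterruptedException;

    void destroy();
  }

  private final Deque<Worker> idle = new ArrayDeque<>();

  public synchronized void add(Worker worker) {
    idle.push(worker);
  }

  public synchronized int idleCount() {
    return idle.size();
  }

  /**
   * Runs a job on an idle worker. The worker goes back to the pool only if
   * the job completed; if the step is interrupted (or fails in any other way),
   * the worker is destroyed so the next build cannot pick up a dead process.
   * Throws NoSuchElementException if no idle worker is available.
   */
  public String runJob(String job) throws InterruptedException {
    Worker worker;
    synchronized (this) {
      worker = idle.pop();
    }
    boolean completed = false;
    try {
      String result = worker.execute(job);
      completed = true;
      return result;
    } finally {
      if (completed) {
        add(worker);
      } else {
        worker.destroy();
      }
    }
  }
}
```

Destroying the worker in the `finally` block means the pool never holds a worker whose state is unknown, at the cost of restarting a worker after every interrupted step.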