Created by: svet93
Currently, when a job stalls more than the allowed number of times, it is moved to failed, but the failed event is not published. We noticed this because we were running a job that starts child jobs and uses the `.finished()` function to know when the work is done. When a job stalled, it never reached the `global:failed` handler, so the promise returned by `finished` never settled.
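
To illustrate the failure mode, here is a minimal sketch that models it with a plain Node `EventEmitter`. It is not Bull's implementation: `FakeQueue`, `publishOnStallFailure`, `failStalledJob`, and `outcome` are hypothetical names made up for this example, and the failure reason string is illustrative. It only assumes that a `finished`-style promise settles when a `global:completed` or `global:failed` event for the job is published.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a queue's global event channel (not Bull itself).
class FakeQueue extends EventEmitter {
  constructor({ publishOnStallFailure }) {
    super();
    // Assumption: this flag models the behavior before (false) and after (true) this change.
    this.publishOnStallFailure = publishOnStallFailure;
    this.failedJobs = new Set();
  }

  // Settles only when a terminal event for jobId is published.
  finished(jobId) {
    return new Promise((resolve, reject) => {
      this.on('global:completed', (id, result) => {
        if (id === jobId) resolve(result);
      });
      this.on('global:failed', (id, reason) => {
        if (id === jobId) reject(new Error(reason));
      });
    });
  }

  // Models a job that exceeded the stall limit: it is always recorded as
  // failed, but the event is published only when the flag is set.
  failStalledJob(jobId) {
    this.failedJobs.add(jobId);
    if (this.publishOnStallFailure) {
      this.emit('global:failed', jobId, 'stalled more than the allowable limit');
    }
  }
}

// Reports whether finished() settled within timeoutMs after the stall failure.
async function outcome(queue, jobId, timeoutMs = 50) {
  const settled = queue.finished(jobId).then(
    () => 'resolved',
    () => 'rejected'
  );
  const timedOut = new Promise((resolve) =>
    setTimeout(() => resolve('pending'), timeoutMs)
  );
  queue.failStalledJob(jobId);
  return Promise.race([settled, timedOut]);
}
```

In this model, `outcome(new FakeQueue({ publishOnStallFailure: false }), 'job-1')` yields `'pending'`, because the job is recorded as failed but nothing is listening for a state change that was never announced. With `publishOnStallFailure: true` it yields `'rejected'`, which is what a caller waiting on the job needs in order to stop waiting.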
If a job has stalled more than the allowable limit, fail and publish failure
fixes #2041 (closed)