When a signal occurs inside an rseq region, DR now provides the actual interruption PC to the kernel xfer event (but not the signal event, to match kernel behavior). This is required to accurately place the signal in a drmemtrace offline trace.
What this looks like:

Before:

    0x00007fe36b1ea8f1  74 02                 jz     $0x00007fe36b1ea8f5
    0x00007fe36b1ea8f3  0f 0b                 ud2a
    0x00007fe36b1ea8f5  83 05 74 38 00 00 01  addl   $0x01, 0x00007fe36b1ee170
    <marker: kernel xfer to handler>
    <marker: timestamp 13224196441528495>
    <marker: tid 74400 on core 0>
    0x00007fe36b1ea85c  55                    push   %rbp

After:

    0x00007f15e71d78f1  74 02                 jz     $0x00007f15e71d78f5
    0x00007f15e71d78f3  0f 0b                 ud2a
    <marker: kernel xfer to handler>
    <marker: timestamp 13224196672834337>
    <marker: tid 78727 on core 1>
    0x00007f15e71d785c  55                    push   %rbp
Adds a trace_invariants check for a signal immediately after UD2A. I first went through and added module_mapper and decoding to trace_invariants, but then we have to link in static DR (even if we do our own raw decoding, because module_mapper uses DR), and thus we can no longer run the test on AArch64 (#2007 (closed)) or Mac. Instead, I went with the simpler solution of putting the annotations that signal_invariants.c uses into rseq.c.
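To illustrate the idea of an annotation-driven check that needs no decoding, here is a minimal toy model in Python. It is only a sketch and not the trace_invariants code: the record types, the `next_instr_faults` annotation name, and the `check_signal_follows_fault` function are all hypothetical, and how the real annotations are encoded in the trace is not described here. The assumption is that an annotation tells the checker that the next instruction will fault, so the record directly after that instruction must be the kernel xfer marker.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical record types for this sketch; not drmemtrace's real structures.
@dataclass
class Instr:
    pc: int


@dataclass
class Marker:
    kind: str


@dataclass
class Annotation:
    name: str


Record = Union[Instr, Marker, Annotation]

FAULT_NEXT = "next_instr_faults"  # hypothetical annotation name
KERNEL_XFER = "kernel_xfer"       # hypothetical marker kind


def check_signal_follows_fault(records: List[Record]) -> List[str]:
    """Return violation messages; an empty list means the invariant holds."""
    errors: List[str] = []
    expect_instr = False  # an annotation was seen; the next instruction faults
    faulting_pc = None    # pc of the faulting instruction awaiting its marker
    for rec in records:
        if faulting_pc is not None:
            # The record right after the faulting instruction must be the marker.
            if not (isinstance(rec, Marker) and rec.kind == KERNEL_XFER):
                errors.append(
                    f"no kernel xfer marker directly after {faulting_pc:#x}")
            faulting_pc = None
        if isinstance(rec, Annotation) and rec.name == FAULT_NEXT:
            expect_instr = True
        elif isinstance(rec, Instr) and expect_instr:
            faulting_pc = rec.pc
            expect_instr = False
    if faulting_pc is not None:
        errors.append(f"trace ended before a marker for {faulting_pc:#x}")
    return errors


# The "Before" and "After" listings, with the hypothetical annotation placed
# ahead of the ud2a instruction.
before_trace = [
    Instr(0x00007FE36B1EA8F1), Annotation(FAULT_NEXT),
    Instr(0x00007FE36B1EA8F3), Instr(0x00007FE36B1EA8F5),
    Marker(KERNEL_XFER), Marker("timestamp"), Marker("cpu_id"),
    Instr(0x00007FE36B1EA85C),
]
after_trace = [
    Instr(0x00007F15E71D78F1), Annotation(FAULT_NEXT),
    Instr(0x00007F15E71D78F3),
    Marker(KERNEL_XFER), Marker("timestamp"), Marker("cpu_id"),
    Instr(0x00007F15E71D785C),
]
```

Under these assumptions, the checker flags the "Before" ordering, where the addl appears between the ud2a and the marker, and accepts the "After" ordering.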
I had to fix 2 bugs to get the marker in the right place and to get this invariants check to fire:
- raw2trace was adding to cur_modoffs before memrefs and then having an extra check for where to put the signal, which doesn't make sense. It ended up putting the marker too early. After fixing, the other tests still pass, so it is not clear why I had it that way.
- op_offline was off inside trace_invariants, so the assert on a signal after ud2a was not firing. I now have analyzer_multi setting op_offline up front as a synthetic option for post-processing usage.
I also added a direct jump as an annotation at the start of the abort handler in rseq.c, so that trace_invariants does not complain about a signal not returning to the interruption point.
I added a drcachesim offline regression test tracing the rseq test's code. This same test will be separately used for #4019 (closed).
Issue: #4019 (closed), #4041 (closed)
Fixes #4041 (closed)