Eliminates the call-return reliance for the native execution step of rseq support. Makes a local copy of the sequence right inside the sequence-ending block and executes it. The sequence is inserted as additional instructions and is then mangled normally (mangling changes are assumed to be restartable), but it is not passed to clients. Any exits are regular block exits, resulting in a block with many exits.
The prior call-return scheme is left in place under a temporary option, -rseq_assume_call, as a failsafe in case stability problems are discovered with this native execution implementation. Once we are happy with the new scheme we can remove the option.
To make the local copy an rseq region, the per-thread rseq_cs address is identified by watching system calls. For attach, it is identified by searching the possible static TLS offsets. The assumption of a constant offset is documented and verified.
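As a rough illustration of the constant-offset assumption, the sketch below records the offset of the struct rseq from the thread pointer the first time a registration is observed and checks later registrations against it. The struct mirrors the original fields of the kernel's struct rseq; rseq_tls_offset_t, record_rseq_registration(), and rseq_cs_field_addr() are hypothetical names for illustration and are not the routines added by this change. The attach-time search over static TLS offsets is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Original fields of the kernel's per-thread struct rseq (see linux/rseq.h). */
struct rseq_abi {
    uint32_t cpu_id_start;
    uint32_t cpu_id;
    uint64_t rseq_cs; /* Address of the active struct rseq_cs, or 0. */
    uint32_t flags;
} __attribute__((aligned(32)));

/* Hypothetical tracker for the assumed-constant TLS offset. */
typedef struct {
    bool known;
    ptrdiff_t offset; /* struct rseq address minus the thread pointer. */
} rseq_tls_offset_t;

/* Called with the first argument of an observed rseq system call.
 * Returns false if the offset differs from the one seen earlier.
 */
bool
record_rseq_registration(rseq_tls_offset_t *state, uintptr_t thread_pointer,
                         uintptr_t rseq_addr)
{
    ptrdiff_t offset = (ptrdiff_t)rseq_addr - (ptrdiff_t)thread_pointer;
    if (!state->known) {
        state->known = true;
        state->offset = offset;
        return true;
    }
    return state->offset == offset;
}

/* Address of the rseq_cs pointer field for a thread with the given thread pointer. */
uintptr_t
rseq_cs_field_addr(const rseq_tls_offset_t *state, uintptr_t thread_pointer)
{
    assert(state->known);
    return (uintptr_t)((ptrdiff_t)thread_pointer + state->offset) +
        offsetof(struct rseq_abi, rseq_cs);
}
```

With the offset known, code emitted into a sequence-ending block can address the rseq_cs pointer field relative to the TLS segment rather than looking it up per thread.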
The rseq_cs's abort handler is a new exit added with the app's signature as data just before it, hidden in the operands of a nop instruction to avoid problems with decoding the fragment. A local jump skips over the data and exit.
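The kernel requires the 4 bytes immediately preceding the abort handler to match the signature passed at registration. A minimal sketch of hiding those bytes in a nop follows, assuming the 7-byte x86 encoding 0f 1f 80 disp32 so that the displacement is the last 4 bytes of the instruction; the encoding actually chosen by this change may differ, and emit_signature_nop() and signature_precedes() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SIG_NOP_LENGTH = 7 };

/* Writes "nop dword ptr [rax + disp32]" (0f 1f 80 <disp32>) at pc, with the
 * signature as the displacement. Returns the address just past the nop,
 * which is where the abort handler would begin.
 */
uint8_t *
emit_signature_nop(uint8_t *pc, uint32_t signature)
{
    pc[0] = 0x0f;
    pc[1] = 0x1f;
    pc[2] = 0x80;
    /* Host byte order is little-endian on x86, matching the encoding. */
    memcpy(pc + 3, &signature, sizeof(signature));
    return pc + SIG_NOP_LENGTH;
}

/* Mimics the kernel-side check: the 4 bytes before abort_ip must equal the signature. */
int
signature_precedes(const uint8_t *abort_ip, uint32_t signature)
{
    uint32_t found;
    memcpy(&found, abort_ip - sizeof(found), sizeof(found));
    return found == signature;
}
```

Because the signature sits inside a well-formed instruction, a decoder walking the fragment sees an ordinary nop rather than raw data.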
A new rseq_cs structure is allocated for each sequence-ending fragment. It is stored in a hashtable in the rseq module, to avoid complexities and overhead of adding an additional fragment_t or "subclass" field. A new flag is set to trigger calling into the rseq module on fragment deletion.
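The side-table idea can be sketched as a small chained hashtable keyed by fragment address, with a flag gating the deletion callback. All names here (FRAG_HAS_RSEQ_ENDPOINT, rseq_cs_table_t, and the functions) are hypothetical, and the table is far simpler than the real one; it only shows why no extra fragment field is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FRAG_HAS_RSEQ_ENDPOINT 0x1u /* Hypothetical fragment flag. */
#define TABLE_BUCKETS 64

typedef struct entry {
    const void *fragment; /* Key: the sequence-ending fragment. */
    void *rseq_cs;        /* Value: its heap-allocated rseq_cs structure. */
    struct entry *next;
} entry_t;

typedef struct {
    entry_t *buckets[TABLE_BUCKETS];
} rseq_cs_table_t;

static size_t
bucket_of(const void *key)
{
    return (size_t)(((uintptr_t)key >> 4) % TABLE_BUCKETS);
}

/* Associates an rseq_cs with a fragment. Returns 0 on allocation failure. */
int
rseq_table_add(rseq_cs_table_t *table, const void *fragment, void *rseq_cs)
{
    entry_t *e = malloc(sizeof(*e));
    if (e == NULL)
        return 0;
    e->fragment = fragment;
    e->rseq_cs = rseq_cs;
    e->next = table->buckets[bucket_of(fragment)];
    table->buckets[bucket_of(fragment)] = e;
    return 1;
}

void *
rseq_table_lookup(const rseq_cs_table_t *table, const void *fragment)
{
    const entry_t *e;
    for (e = table->buckets[bucket_of(fragment)]; e != NULL; e = e->next) {
        if (e->fragment == fragment)
            return e->rseq_cs;
    }
    return NULL;
}

/* Deletion hook: does nothing unless the fragment carries the flag.
 * Frees the rseq_cs and its entry; returns 1 if an entry was removed.
 */
int
rseq_on_fragment_delete(rseq_cs_table_t *table, const void *fragment, unsigned flags)
{
    entry_t **prev;
    if ((flags & FRAG_HAS_RSEQ_ENDPOINT) == 0)
        return 0;
    for (prev = &table->buckets[bucket_of(fragment)]; *prev != NULL;
         prev = &(*prev)->next) {
        if ((*prev)->fragment == fragment) {
            entry_t *dead = *prev;
            *prev = dead->next;
            free(dead->rseq_cs);
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

Only fragments that end a sequence pay for the lookup; all others skip the hook because the flag is clear.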
The rseq_cs fields are filled in via a new post-emit control point, using information stored in labels during mangling. The pointer to the rseq_cs is inserted with a dummy value and patched in this new control point using a new utility routine patch_mov_immed_ptrsz().
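A hedged sketch of the post-emit step: once the final code cache address is known, offsets recorded during mangling are turned into the absolute fields of the kernel's struct rseq_cs, and the placeholder immediate is overwritten. The struct mirrors the kernel layout; rseq_label_info_t, fill_rseq_cs(), and patch_mov_imm64() are hypothetical, and patch_mov_imm64() is only a stand-in that assumes a 10-byte x86-64 mov r64, imm64. It does not reproduce the signature or behavior of patch_mov_immed_ptrsz().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout of the kernel's struct rseq_cs (see linux/rseq.h). */
struct rseq_cs_abi {
    uint32_t version;
    uint32_t flags;
    uint64_t start_ip;
    uint64_t post_commit_offset;
    uint64_t abort_ip;
} __attribute__((aligned(32)));

/* Hypothetical record of offsets, from the fragment start, noted during mangling. */
typedef struct {
    uint32_t mov_offs;   /* The mov that loads the rseq_cs pointer. */
    uint32_t start_offs; /* First instruction of the local copy. */
    uint32_t end_offs;   /* Just past the committing instruction. */
    uint32_t abort_offs; /* The abort handler exit. */
} rseq_label_info_t;

/* Converts label offsets into absolute rseq_cs fields once cache_pc is final. */
void
fill_rseq_cs(struct rseq_cs_abi *cs, const uint8_t *cache_pc,
             const rseq_label_info_t *info)
{
    cs->version = 0;
    cs->flags = 0;
    cs->start_ip = (uint64_t)(uintptr_t)(cache_pc + info->start_offs);
    cs->post_commit_offset = info->end_offs - info->start_offs;
    cs->abort_ip = (uint64_t)(uintptr_t)(cache_pc + info->abort_offs);
}

/* Overwrites the 8-byte immediate of "REX.W b8+r imm64".
 * Returns 0 without writing if the bytes at mov_pc are not that form.
 */
int
patch_mov_imm64(uint8_t *mov_pc, uint64_t value)
{
    if ((mov_pc[0] & 0xfe) != 0x48 || (mov_pc[1] & 0xf8) != 0xb8)
        return 0;
    memcpy(mov_pc + 2, &value, sizeof(value));
    return 1;
}
```

Deferring both steps to after emission avoids guessing the final addresses while the instruction list is still being mangled.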
To avoid crashing due to invalid rseq bounds after the rseq_cs structure is freed, the rseq pointer is cleared explicitly on completion. On a midpoint exit, it is cleared by the fragment deletion hook together with a hook on the shared fragment flushtime update, which ensures that all threads are covered.
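The clearing rule can be modeled as follows: before an rseq_cs structure is freed, any thread whose rseq_cs pointer still refers to it resets that pointer to zero. This is a simplified model with hypothetical names (thread_rseq_t, rseq_clear_if_stale()); in the real scheme each thread clears its own TLS field from the hooks described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the rseq_cs pointer field of one thread's struct rseq. */
typedef struct {
    uint64_t rseq_cs;
} thread_rseq_t;

/* Run by a thread on itself before dying_cs is freed. Clears the pointer
 * only if it still refers to the structure being freed, and returns
 * whether it did so.
 */
bool
rseq_clear_if_stale(thread_rseq_t *self, const void *dying_cs)
{
    if (self->rseq_cs != (uint64_t)(uintptr_t)dying_cs)
        return false;
    self->rseq_cs = 0;
    return true;
}
```

A zero pointer tells the kernel that the thread is not inside any critical section, so it never dereferences freed memory.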
The rseq test is augmented and expanded. An invalid instruction is added to properly test the abort handler, under a conditional to allow testing each sequence both to completion and on abort.
Properly handling a midpoint exit during the instrumentation execution is future work: we need to invoke the native version as well.
Adding aarchxx support is also future work: patch_mov_immed_ptrsz(), the writes to the rseq struct in TLS, and the rseq tests are currently x86-only.
Issue: #2350 (closed)