Adds W^X-aware handling of several generated-code cases missed in the original implementation:
- mangle_syscall_code()
- shift_ctis_in_fragment()
Adds best-effort handling of fork (a race remains, and the overhead may be noticeable).
Adds a fix to use the proper heap type when extending reachable heap units (the bug showed up as a trace encoding bug).
Adds a usage error when using dr_nonheap_alloc() with +wx memory, which we do not support with W^X.
Changes the single -satisfy_w_xor_x test into a cross-cutting option set, which expands into multiple tests to cover more behavior. To include drcachesim tests here, adds a new test feature, _self_serial, which adds dependences among copies of the same test run under different options.
Issue: #3556