Translation failures and asserts from client calling dr_app_pc_from_cache_pc on bb prefixes and rseq mangling
We have a client with an itimer that calls dr_app_pc_from_cache_pc() on the sampled PC at each sample. This results in many instances of a curiosity, and some instances of an assert:
<CURIOSITY : tdcontext != get_thread_private_dcontext() || (((OPTION_IS_INTERNAL_stress_recreate_pc)) ? ((((void)(((dynamo_options.checklevel >= (1)) && !(!((OPTION_IS_STRING_stress_recreate_pc)) || (((&options_lock)->num_readers > 0) || self_owns_write_lock(&options_lock)))) ? (d_r_internal_error("core/translate.c", 907, "!((OPTION_IS_STRING_stress_recreate_pc)) || READWRITE_LOCK_HELD(&options_lock)"), 0) : 0)), dynamo_options.stress_recreate_pc)) : ((((dynamo_options.checklevel >= (1)) && !((0))) ? (d_r_internal_error("non-internal option argument " "stress_
Internal Error: DynamoRIO debug check failure: core/translate.c:991 tdcontext != get_thread_private_dcontext() || INTERNAL_OPTION(stress_recreate_pc) || TEST(FRAG_SELFMOD_SANDBOXED, flags) || TEST(FRAG_WAS_DELETED, flags)
Investigating, we see that the translation points are always either in bb prefixes (we're running -disable_traces, but on ARM every bb has a prefix) or in rseq mangling code.
Action items:
- Relax these asserts for client requests: these aren't synchronous faults.
- Ideally, add full xl8 support for all parts of rseq mangling and for prefixes. This would improve performance (fewer synchall retries) and would help cases like these client samples.
- Print out DR's base at the top of these assert message callstacks so we can symbolize them more easily.
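
To make the first action item concrete, here is a minimal standalone sketch of how the assert condition could be relaxed for client requests. It is a model, not DR's code: `xl8_reason_t`, `xl8_query_t`, and both helper functions are hypothetical names invented for illustration, and the flag bit values are placeholders rather than DR's real `FRAG_*` values. The "current" predicate simply mirrors the condition printed in the assert message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch only (assumption): the real
 * FRAG_* values are defined inside DR and are not reproduced here. */
#define FRAG_SELFMOD_SANDBOXED 0x1u
#define FRAG_WAS_DELETED 0x2u
#define TEST(mask, var) (((mask) & (var)) != 0)

/* Hypothetical: why the translation was requested. */
typedef enum {
    XL8_SYNCHRONOUS_FAULT, /* translation for a fault in the cache */
    XL8_CLIENT_REQUEST,    /* e.g. dr_app_pc_from_cache_pc() from a sample */
} xl8_reason_t;

/* Hypothetical bundle of the inputs the assert looks at. */
typedef struct {
    bool own_thread;         /* models tdcontext == get_thread_private_dcontext() */
    bool stress_recreate_pc; /* models INTERNAL_OPTION(stress_recreate_pc) */
    unsigned int flags;      /* fragment flags */
    xl8_reason_t reason;     /* not part of the existing condition */
} xl8_query_t;

/* Mirrors the condition in the assert message: a translation failure is
 * tolerated only in these cases. */
static bool
xl8_failure_tolerated_current(const xl8_query_t *q)
{
    return !q->own_thread || q->stress_recreate_pc ||
        TEST(FRAG_SELFMOD_SANDBOXED, q->flags) ||
        TEST(FRAG_WAS_DELETED, q->flags);
}

/* Relaxed variant: additionally tolerate failures for client requests,
 * since those are not synchronous faults. */
static bool
xl8_failure_tolerated_relaxed(const xl8_query_t *q)
{
    return q->reason == XL8_CLIENT_REQUEST || xl8_failure_tolerated_current(q);
}
```

With this shape, a failed translation of a client's sampled PC on its own thread (in a bb prefix or in rseq mangling) would no longer trip the check, while a failure on a genuine synchronous fault would still be reported.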