[APP CRASH, AArch64] aflags not properly restored using drreg API due to incomplete decoder
Use of drreg's API (drreg_reserve_aflags() and drreg_unreserve_aflags()) to save and restore aflags on AArch64 clobbered the aflags for an application. The control flow under DR was different than the control flow on a native run, leading to an unexpected application-level assertion. The bug exhibited after a couple of million instructions, and DR was run with these arguments: -unsafe_build_ldstex -disable_traces -vm_size 2G -no_enable_reset -t drcachesim -offline -trace_after_instrs 3000G -exit_after_tracing 20G.
As an optimization, drreg restores aflags as late as possible to minimize spills/fills. This optimization works fine for all the recognized flags-sensitive instructions (those that read or write flags) in the application that I checked. However, for AArch64 the decoder is incomplete, and unrecognized instructions are decoded as instances of a generic instruction, OP_xx. It seems that the placement of the aflags restoration by drreg does not take these unrecognized instructions into account, and there are multiple cases like the following:
m4 ... subs ...
L3 ... xx ...
m4 ... msr %x1 -> %nzcv
In this example, DR clobbers the aflags (with subs), then an unrecognized instruction of the application (xx) that may read/write aflags operates on the clobbered aflags, and then the flags are overwritten (msr), nullifying any effect of the unrecognized instruction on the flags. Instead, the flags should have been restored prior to the xx instruction.
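To make the placement problem concrete, here is a minimal toy model in C. It is not drreg's code and does not use any DynamoRIO types: toy_instr_t, needs_app_flags() and restore_point() are hypothetical names invented for this sketch. It only shows how a lazy "restore before the first reader" policy skips past an undecoded instruction, and how treating undecoded instructions conservatively moves the restore in front of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record (NOT DynamoRIO's instr_t). */
typedef struct {
    bool is_app;      /* true: application instruction; false: tool-inserted */
    bool decoded;     /* false models an unrecognized OP_xx instruction */
    bool reads_flags; /* only meaningful when decoded is true */
} toy_instr_t;

/* Does this instruction need the application's flag values to be live?
 * With conservative == false, an undecoded instruction is assumed not to
 * touch the flags, which models the behavior described in this report.
 * With conservative == true, an undecoded instruction is assumed to need them. */
static bool
needs_app_flags(const toy_instr_t *in, bool conservative)
{
    if (!in->is_app)
        return false;
    if (!in->decoded)
        return conservative;
    return in->reads_flags;
}

/* Returns the index of the instruction before which the app flags must be
 * restored after a tool clobber at clobber_idx, or n if no instruction in
 * the block needs them (i.e. restore at the end of the block). */
size_t
restore_point(const toy_instr_t *block, size_t n, size_t clobber_idx,
              bool conservative)
{
    assert(clobber_idx < n);
    for (size_t i = clobber_idx + 1; i < n; i++) {
        if (needs_app_flags(&block[i], conservative))
            return i;
    }
    return n;
}
```

For a three-instruction block consisting of a tool-inserted flag clobber (like the subs above), an undecoded application instruction (xx), and a decoded application instruction that reads flags, the non-conservative policy places the restore at index 2, after xx, while the conservative policy places it at index 1, before xx. A real fix would also have to treat xx as a potential flags writer, so that a later restore does not overwrite its result.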