AArch64 clean calls overwrite whereami field, clobbering signals_pending field
This surfaces as the assert info->interrupted == NULL || info->interrupted == f (also seen in #2328 (closed) and #4670 (closed), but with different root causes). Here, signals_pending suddenly becomes 0, so DR fails to relink a fragment (and fails to deliver a signal):
disp signals_pending = 0
d_r_dispatch: target = 0x0000ffffbb1d29bc
Entry into F100338(0x0000ffffbb1d29bc).0x0000aaaa8ce6d740 (shared)
master_signal_handler: thread=976210, sig=26, xsp=0x0000fffd72942da0, retaddr=0x000000000000001a
received alarm 1 @0x0000aaaa8a75bfdc
dr value is now 0
master_signal_handler 26 returning now to 0x0000aaaa8a75bfdc
master_signal_handler: thread=976210, sig=27, xsp=0x0000fffd72942da0, retaddr=0x000000000000001b
received alarm 2 @0x0000aaaa8a75bfdc
app value is now 0
record_pending_signal(27) from gen routine or stub 0x0000aaaa8a75bfdc
unlinking outgoing for interrupted F2430
action is not SIG_IGN
3rd pending alarm 27 => dropping 2nd
signals_pending = 1
master_signal_handler 27 returning now to 0x0000aaaa8a75bfdc
master_signal_handler: thread=976210, sig=27, xsp=0x0000fffd72942da0, retaddr=0x000000000000001b
received alarm 2 @0x0000aaaad3de3388
app value is now 0
record_pending_signal(27) from DR at pc 0x0000aaaad3de3388
action is not SIG_IGN
3rd pending alarm 27 => dropping 2nd
signals_pending = 1
master_signal_handler 27 returning now to 0x0000aaaad3de3388
master_signal_handler: thread=976210, sig=26, xsp=0x0000fffd72942da0, retaddr=0x000000000000001a
received alarm 1 @0x0000aaaad39322e4
dr value is now 0
master_signal_handler 26 returning now to 0x0000aaaad39322e4
Exit from F100337(0x0000ffffbb1d29a4).0x0000aaaa8ce6d708 (shared)
(block ends with syscall)
Entry into do_syscall to execute a non-ignorable system call
system call 63
Exit from system call
post syscall: sysnum=0x000000000000003f, result=0x0000000000001000 (4096)
disp signals_pending = 0
d_r_dispatch: target = 0x0000ffffbb1d29bc
Entry into F100338(0x0000ffffbb1d29bc).0x0000aaaa8ce6d740 (shared)
Exit from F100337(0x0000ffffbb1d29a4).0x0000aaaa8ce6d708 (shared)
(block ends with syscall)
Entry into do_syscall to execute a non-ignorable system call
system call 63
Exit from system call
post syscall: sysnum=0x000000000000003f, result=0x0000000000001000 (4096)
disp signals_pending = 0
d_r_dispatch: target = 0x0000ffffbb1d29bc
Entry into F100338(0x0000ffffbb1d29bc).0x0000aaaa8ce6d740 (shared)
master_signal_handler: thread=976210, sig=27, xsp=0x0000fffd72942da0, retaddr=0x000000000000001b
received alarm 2 @0x0000aaaa8b1705cc
app value is now 0
record_pending_signal(27) from cache pc 0x0000aaaa8b1705cc
delaying until exit F17676
SYSLOG_ERROR: Application xxx (zzz). Internal Error: DynamoRIO debug check failure: core/unix/signal.c:4367 info->interrupted == NULL || info->interrupted == f
(Error occurred @100970 frags)
Adding padding before the dcontext.signals_pending field makes the assert disappear. It looks like clean call code writes 64-bit registers onto the 32-bit dcontext.whereami field, overflowing and clobbering signals_pending.
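To illustrate the suspected mechanism, here is a minimal sketch in plain C, not DynamoRIO code. The struct types, the helper store64_at, and the field layout are all hypothetical: it assumes signals_pending sits directly after a 32-bit whereami and models both as uint32_t, which may not match the real dcontext. A memcpy of 8 bytes stands in for the 64-bit register store that the clean call code appears to emit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the relevant slice of the dcontext; the real
 * structure has many more fields and its exact layout is assumed here. */
typedef struct {
    uint32_t whereami;        /* 32-bit field targeted by the store */
    uint32_t signals_pending; /* assumed to sit directly after whereami */
} toy_dcontext_t;

/* Same fields, with padding inserted before signals_pending. */
typedef struct {
    uint32_t whereami;
    uint32_t pad;             /* absorbs the upper half of a 64-bit store */
    uint32_t signals_pending;
} toy_dcontext_padded_t;

/* Emulates a 64-bit register store to base+offset. The bounds check keeps
 * the copy inside the toy object so the demo itself is well defined. */
static void
store64_at(void *base, size_t size, size_t offset, uint64_t value)
{
    assert(offset + sizeof(value) <= size);
    memcpy((char *)base + offset, &value, sizeof(value));
}

/* Returns signals_pending after a 64-bit write aimed at whereami. */
uint32_t
pending_after_wide_store(void)
{
    toy_dcontext_t dc = { 0, 1 }; /* one signal pending */
    store64_at(&dc, sizeof(dc), offsetof(toy_dcontext_t, whereami), 2);
    return dc.signals_pending;
}

/* Same write, but against the layout with padding before signals_pending. */
uint32_t
pending_after_wide_store_padded(void)
{
    toy_dcontext_padded_t dc = { 0, 0, 1 }; /* one signal pending */
    store64_at(&dc, sizeof(dc), offsetof(toy_dcontext_padded_t, whereami), 2);
    return dc.signals_pending;
}
```

In the unpadded layout the wide store spills into signals_pending, so pending_after_wide_store() no longer returns the original 1 (on a little-endian target it returns 0, matching the "disp signals_pending = 0" lines in the log). In the padded layout the spill lands in pad and pending_after_wide_store_padded() still returns 1, which is consistent with the padding experiment hiding the assert without fixing the store width.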