CRASH in drcachesim release build
All drcachesim release build tests are failing today. They did not fail yesterday. The 40 failing tests: http://dynamorio.org/CDash/viewTest.php?onlyfailed&buildid=41954
They are crashing. I logged into the CDash machine and got a callstack for one of them. The crash is during init, but since some tests are listed as taking > 1 minute, I don't know whether they all crash the same way:
$ bin64/drrun -t drcachesim -offline -- suite/tests/bin/simple_app
<Application /work/dr/nightly/run/build_release-external-64/suite/tests/bin/simple_app (29123). DynamoRIO Cache Simulator Tracer internal crash at PC 0x0000000072015e1b. Please report this at http://dynamorio.org/issues. Program aborted.
Received SIGSEGV at client library pc 0x0000000072015e1b in thread 29123
Base: 0x00007f813a97f000
Registers:eax=0x000000007222ba78 ebx=0x0000000056022fe0 ecx=0x0000000000000000 edx=0x0000000000000000
esi=0x0000000056022fe0 edi=0x00000000400ea5d0 esp=0x00007ffe56022980 ebp=0x0000000000009fff
r8 =0x0000000000000000 r9 =0x0000000000000001 r10=0x0000000000000001 r11=0x0000000040059e00
r12=0x0000000000007fff r13=0x00000000400ea5d0 r14=0x0000000000000002 r15=0x0000000000000000
eflags=0x0000000000010206
version 7.1.17980, custom build
-no_dynamic_options -client_lib '/work/dr/nightly/run/build_release-external-64/bin64/../clients/lib64/release/libdrmemtrace.so;0;-offline' -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -early_inject -emulate_brk -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exe
/work/dr/nightly/run/build_release-external-64/bin64/../clients/lib64/release/libdrmemtrace.so=0x0000000072000000
/work/dr/nightly/run/build_release-external-64/ext/lib64/release/libdrcovlib.so=0x0000000073800000
/work/dr/nightly/run/build_release-external-64/ext/lib64/release/libdrx.so=0x0000000077000000
/work/dr/nightly/run/build_release-external-64/ext/lib64/release/libdrreg.so=0x0000000078000000
/work/dr/nightly/run/build_release-external-64>
$ gdb --args bin64/drrun -t drcachesim -offline -- suite/tests/bin/simple_app
<...>
Program received signal SIGILL, Illegal instruction.
0x00007fe2ee19e5a6 in syscall_ready ()
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x0000000072015e1b in ?? ()
(gdb) add-symbol-file /work/dr/nightly/run/build_release-external-64/bin64/../clients/lib64/release/libdrmemtrace.so 0x0000000072007820
add symbol table from file "/work/dr/nightly/run/build_release-external-64/bin64/../clients/lib64/release/libdrmemtrace.so" at
.text_addr = 0x72007820
(y or n) y
Reading symbols from /work/dr/nightly/run/build_release-external-64/bin64/../clients/lib64/release/libdrmemtrace.so...Reading symbols from /work/dr/nightly/run/build_release-external-64/clients/lib64/release/libdrmemtrace.so.debug...done.
done.
(gdb) bt
#0 offline_instru_t::append_unit_header (this=0x47b4e5d0, buf_ptr=0xc72688f0 <error: Cannot access memory at address 0xc72688f0>, tid=0)
at /work/dr/nightly/src/clients/drcachesim/tracer/instru_offline.cpp:284
#1 0x0000000072011c2c in drmemtrace_client_main (id=<optimized out>, argc=<optimized out>, argv=<optimized out>)
at /work/dr/nightly/src/clients/drcachesim/tracer/tracer.cpp:1810
#2 0x00007fe2ee121c1b in instrument_init () at /work/dr/nightly/src/core/lib/instrument.c:733
#3 0x00007fe2ee0b1cc8 in dynamorio_app_init () at /work/dr/nightly/src/core/dynamo.c:680
#4 0x00007fe2ee196487 in privload_early_inject (sp=0x7ffdc7269360, old_libdr_base=<optimized out>, old_libdr_size=<optimized out>)
at /work/dr/nightly/src/core/unix/loader.c:1947
#5 0x00007fe2ee178065 in reloaded_xfer ()
#6 0x0000000000000001 in ?? ()
#7 0x00007ffdc726b62b in ?? ()
#8 0x0000000000000000 in ?? ()
(gdb) x/8i $pc
=> 0x72015e1b <offline_instru_t::append_unit_header(unsigned char*, int)+11>: movzbl 0x7(%rsi),%eax
0x72015e1f <offline_instru_t::append_unit_header(unsigned char*, int)+15>: mov %rsi,%rbx
0x72015e22 <offline_instru_t::append_unit_header(unsigned char*, int)+18>: lea 0x8(%rbx),%r12
0x72015e26 <offline_instru_t::append_unit_header(unsigned char*, int)+22>: and $0x1f,%eax
0x72015e29 <offline_instru_t::append_unit_header(unsigned char*, int)+25>: or $0xffffff80,%eax
0x72015e2c <offline_instru_t::append_unit_header(unsigned char*, int)+28>: mov %al,0x7(%rsi)
0x72015e2f <offline_instru_t::append_unit_header(unsigned char*, int)+31>: callq 0x72015510 <instru_t::get_timestamp()>
0x72015e34 <offline_instru_t::append_unit_header(unsigned char*, int)+36>: movzbl %ah,%edx
(gdb) info reg
rax 0x7222ba78 1914878584
rbx 0xc72688f0 3341191408
rcx 0x0 0
rdx 0x0 0
rsi 0xc72688f0 3341191408
rdi 0x47b4e5d0 1203037648
rbp 0x9fff 0x9fff
rsp 0x7ffdc7268290 0x7ffdc7268290
r8 0x0 0
r9 0x1 1
r10 0x1 1
r11 0x47abde00 1202445824
r12 0x7fff 32767
r13 0x47b4e5d0 1203037648
r14 0x2 2
r15 0x0 0
rip 0x72015e1b 0x72015e1b <offline_instru_t::append_unit_header(unsigned char*, int)+11>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) x/4wx $rsi
0xc72688f0: Cannot access memory at address 0xc72688f0
Comparing the addresses, this looks to me like 32-bit truncation of the "buf" local variable in tracer.cpp (rsi matches the low 32 bits of a stack address just above rsp):
rsi 0xc72688f0
rsp 0x7ffdc7268290
We've confirmed, on other machines as well, that the crash reproduces with PR #3463 and does not occur without it, so I'm triaging this to the PR's author.