[perf] have memquery iter use memcache to avoid beyond-vmheap slowdowns
Running larger apps (with start/stop attach), I see significant performance issues once we're beyond the vm reservation. I usually run with reset disabled, but I've seen the slowdown even with reset enabled. Is the problem that, while beyond the vm reserve, we're still initializing many threads, and each of those global allocations walks the maps file? Should Linux memquery use the allmem cache to avoid re-reading the large maps file on every query?
The maps file can be quite large:
# wc /proc/84007/maps
9330 47301 483328 /proc/84007/maps
Sample callstack:
Thread 321 (Thread 0x7f0c85b20700 (LWP 95809)):
#0 0x00007f0cabc7a47e in syscall_ready () at dynamorio/trunk/core/dynamo.c:2905
#1 0x00007f0c4c1aaa50 in ?? ()
#2 0x00007f0cabd19e85 in memquery_iterator_next (iter=0x3, iter@entry=0x7f0c4c1aaa50) at dynamorio/trunk/core/unix/memquery_linux.c:218
#3 0x00007f0cabd1eb75 in find_free_memory_in_region (start=start@entry=0x7f0c44846000 "", end=end@entry=0x7f0cc4847000 <myvals+3968> "", size=size@entry=65536,
found_start=found_start@entry=0x7f0c4c1aab48, found_end=0x0) at dynamorio/trunk/core/unix/os.c:2925
#4 0x00007f0cabd200ea in os_heap_reserve_in_region (start=0x7f0c44846000, end=end@entry=0x7f0cc4847000 <myvals+3968>, size=size@entry=65536,
error_code=error_code@entry=0x7f0c4c1aac28, executable=executable@entry=0 '\000') at dynamorio/trunk/core/unix/os.c:2961
#5 0x00007f0cabcd0de4 in vmm_heap_reserve (size=size@entry=65536, error_code=error_code@entry=0x7f0c4c1aac28, executable=executable@entry=0 '\000')
at dynamorio/trunk/core/heap.c:1158
#6 0x00007f0cabcd0ff9 in get_guarded_real_memory (reserve_size=65536, reserve_size@entry=57344, commit_size=commit_size@entry=4096, prot=prot@entry=3,
add_vm=add_vm@entry=0 '\000', guarded=guarded@entry=1 '\001', min_addr=min_addr@entry=0x0) at dynamorio/trunk/core/heap.c:2023
#7 0x00007f0cabcd16c5 in heap_create_unit (size=57344, tu=<optimized out>, must_be_new=0 '\000') at dynamorio/trunk/core/heap.c:2767
#8 0x00007f0cabcd1aaf in common_heap_alloc (tu=0x7f0c4b525f10, size=<optimized out>) at dynamorio/trunk/core/heap.c:3471
#9 0x00007f0cabcc5000 in fragment_create_heap (flags=134219264, indirect_exits=0, direct_exits=2, dcontext=0x7f0c4b525940) at dynamorio/trunk/core/fragment.c:2280
#10 fragment_create (dcontext=dcontext@entry=0x7f0c4b525940, tag=tag@entry=0x7f0cba1cf3b0,
body_size=body_size@entry=24, direct_exits=2, indirect_exits=0, exits_size=0, flags=134219264) at dynamorio/trunk/core/fragment.c:2383
#11 0x00007f0cabcb87b2 in emit_fragment_common (dcontext=dcontext@entry=0x7f0c4b525940,
tag=tag@entry=0x7f0cba1cf3b0, ilist=0x7f0c4c1be9c8, flags=134219264,
vmlist=0x7f0c4c1c8d68, link_fragment=link_fragment@entry=1 '\001', add_to_htable=1 '\001', replace_fragment=0x0) at dynamorio/trunk/core/emit.c:652
#12 0x00007f0cabcb91e4 in emit_fragment_ex (dcontext=dcontext@entry=0x7f0c4b525940,
tag=tag@entry=0x7f0cba1cf3b0, ilist=<optimized out>, flags=<optimized out>,
vmlist=<optimized out>, link=link@entry=1 '\001', visible=1 '\001') at dynamorio/trunk/core/emit.c:1011
#13 0x00007f0cabc90179 in build_basic_block_fragment (dcontext=dcontext@entry=0x7f0c4b525940,
start=0x7f0cba1cf3b0, initial_flags=initial_flags@entry=0, link=link@entry=1 '\001',
visible=visible@entry=1 '\001', for_trace=for_trace@entry=0 '\000', unmangled_ilist=0x0) at dynamorio/trunk/core/arch/interp.c:5133
#14 0x00007f0cabcb4f04 in dispatch (dcontext=0x7f0c4b525940) at dynamorio/trunk/core/dispatch.c:206