add accounting stats for vmm blocks
Most of the numerous memory usage statistics in DR were developed for Windows, where reserving memory and committing memory are very different operations. This results in misleading and confusing stats on Linux:
$ bin64/drrun -loglevel 1 -- suite/tests/bin/simple_app
$ egrep -i 'peak.*guard|peak.*bytes|peak.* in use|peak.*combin' `ls -1td logs/*0|head -1`/s* | tail -n 18
Peak fcache combined capacity (bytes) : 69632
Peak special heap capacity (bytes) : 45056
Peak heap align space (bytes) : 2957
Peak heap bucket pad space (bytes) : 13576
Peak guard pages, reserved virtual pages : 52
Peak stack capacity (bytes) : 172032
Peak mmap capacity (bytes) : 106496
Peak mmap reserved but not committed (bytes) : 167936
Peak heap claimed (bytes) : 555438
Peak heap capacity (bytes) : 630784
Peak heap reserved but not committed (bytes) : 331792
Peak vmm virtual memory blocks in use : 97
Our peak virtual memory in use (bytes) : 1589248
The stats seemingly don't add up. In 100K units, we have 15 total allocated in vmm, yet the stats imply 0.7 cache + 6.3 heap + 1.7 stack + <1 for TLS and gencode (mmap includes cache), which is about 9. Where are the other 6? The 'heap reserved but not committed' stat gives a hint: that accounts for 3. The guard pages account for another 2 (52 pages, assuming 4K pages). Yet on Linux a guard page and a "reserved" page cost as much as a committed page, so it would be good to have stats that assign those pages to their categories: e.g., how many guard pages were for the code cache.
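To make the gap concrete, here is a small C tally of the numbers above. It is only a sketch: the byte counts are copied from the log, the 4096-byte page size is an assumption (the log does not state it), and the function names are illustrative rather than anything in DR.

```c
#include <assert.h>
#include <stddef.h>

/* Peak values copied from the log output above (bytes unless noted). */
enum {
    FCACHE_CAPACITY = 69632,
    HEAP_CAPACITY = 630784,
    STACK_CAPACITY = 172032,
    MMAP_CAPACITY = 106496,      /* includes the code cache */
    HEAP_RESERVED_ONLY = 331792, /* heap reserved but not committed */
    GUARD_PAGES = 52,            /* a page count, not bytes */
    VMM_IN_USE = 1589248,
    ASSUMED_PAGE_SIZE = 4096     /* assumption: not stated in the log */
};

/* Bytes covered by the per-category "capacity" stats. */
size_t
capacity_accounted(void)
{
    /* mmap includes the cache, so count only the non-cache remainder. */
    assert(MMAP_CAPACITY >= FCACHE_CAPACITY);
    return (size_t)FCACHE_CAPACITY + HEAP_CAPACITY + STACK_CAPACITY +
        (MMAP_CAPACITY - FCACHE_CAPACITY);
}

/* Bytes in use in the vmm that no capacity stat covers. */
size_t
capacity_gap(void)
{
    return (size_t)VMM_IN_USE - capacity_accounted();
}

/* Bytes explained by uncommitted heap reservations plus guard pages. */
size_t
reserved_and_guard(void)
{
    return (size_t)HEAP_RESERVED_ONLY + (size_t)GUARD_PAGES * ASSUMED_PAGE_SIZE;
}
```

Under these assumptions the capacity stats cover 909312 bytes, leaving a gap of 679936 bytes, of which uncommitted heap plus guard pages explain 544784.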
Maybe the live unit counts are a better measure:
Peak fcache units on live list : 4
Peak special heap units : 4
Peak heap units on live list : 12
If each unit is 64K (56K + guard pages), that's roughly 2.5 cache + 2.5 special + 7.5 heap => 12.5 (again in 100K units), which is much closer to the 15 total.
For Linux it seems better to pass in a vmm-accounting flag and count up the blocks for each type: cache, heap, stack, special, and mmap. Should guard pages be separated out? No, because they are often smaller than a vmm block.
Downsides:
- cost of passing extra params in release build: though this is on coarse-grain events so not as expensive as heap alloc accounting
- not as needed on Windows: though could split further into reserve vs commit
- not as useful for beyond-vmm allocations
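A minimal sketch of what such per-type accounting could look like in C. Every name here (`vmm_acct_type_t`, `vmm_acct_add`, `vmm_acct_sub`) is hypothetical and is not DR's actual interface; the categories simply mirror the types listed above, and a real implementation would also need whatever synchronization the vmm allocator already uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting categories, one per type listed above. */
typedef enum {
    VMM_ACCT_CACHE,
    VMM_ACCT_HEAP,
    VMM_ACCT_STACK,
    VMM_ACCT_SPECIAL,
    VMM_ACCT_MMAP,
    VMM_ACCT_NUM_TYPES
} vmm_acct_type_t;

typedef struct {
    size_t cur[VMM_ACCT_NUM_TYPES];  /* blocks currently in use per type */
    size_t peak[VMM_ACCT_NUM_TYPES]; /* high-water mark per type */
} vmm_acct_t;

/* Records that num_blocks vmm blocks were allocated for the given type. */
void
vmm_acct_add(vmm_acct_t *acct, vmm_acct_type_t type, size_t num_blocks)
{
    assert(acct != NULL && type < VMM_ACCT_NUM_TYPES);
    acct->cur[type] += num_blocks;
    if (acct->cur[type] > acct->peak[type])
        acct->peak[type] = acct->cur[type];
}

/* Records that num_blocks vmm blocks of the given type were freed. */
void
vmm_acct_sub(vmm_acct_t *acct, vmm_acct_type_t type, size_t num_blocks)
{
    assert(acct != NULL && type < VMM_ACCT_NUM_TYPES);
    assert(acct->cur[type] >= num_blocks);
    acct->cur[type] -= num_blocks;
}
```

The caller would pass the category alongside each vmm reserve or free request. Because these are coarse-grain events, the per-call cost is one add and one compare.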
Results of a prototype implementation:
$ bin64/drrun -- suite/tests/bin/simple_app
Peak vmm blocks for heap : 54
Peak vmm blocks for cache : 16
Peak vmm blocks for stack : 12
Peak vmm blocks for special heap : 11
Peak vmm blocks for special mmap : 4
Peak our virtual memory blocks in use : 97
Now we have a precise breakdown of the 97 blocks (16K each, so 1.5M).
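As a quick sanity check on the breakdown, the per-type peaks can be summed and converted to bytes. The helper names below are illustrative only, and the 16K block size is taken from the sentence above.

```c
#include <assert.h>
#include <stddef.h>

/* 16K per vmm block, as stated above. */
#define VMM_BLOCK_SIZE ((size_t)16 * 1024)

/* Sums n per-type block counts. */
size_t
sum_blocks(const size_t *counts, size_t n)
{
    size_t total = 0;
    assert(counts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += counts[i];
    return total;
}

/* Converts a vmm block count to bytes. */
size_t
blocks_to_bytes(size_t blocks)
{
    return blocks * VMM_BLOCK_SIZE;
}
```

Feeding in the five peaks (54, 16, 12, 11, 4) gives 97 blocks, and 97 blocks at 16K each is 1589248 bytes, matching the "Our peak virtual memory in use" figure from the first run. In general, per-type peaks can occur at different times, so their sum is only an upper bound on the overall peak; in this run they match exactly.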