Adds support for analyzing traces in parallel, with each traced thread processed concurrently. This is made possible by traces now being stored in separate per-thread files.
Adds a new analysis_tool_t interface: if a tool's parallel_shard_supported() returns true, analyzer_t switches to a parallel operation mode. For now, shards are scheduled statically among worker threads. Each worker completely owns one or more shards, eliminating the need for synchronization when processing a shard's trace entries. The default shard is the current unit of trace file splitting: a single traced thread.
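The ownership model can be illustrated with a minimal, self-contained sketch. It is not the analyzer_t implementation: shard_t, assign_shards(), and process_all() are hypothetical names, and the round-robin assignment is an assumption chosen only as one simple form of static scheduling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for one per-thread trace file (a "shard").
struct shard_t {
    std::vector<int> entries; // Placeholder trace entries.
    int64_t sum = 0;          // Result written only by the owning worker.
};

// Static assignment (assumed round-robin): shard i goes to worker i % num_workers.
std::vector<std::vector<size_t>>
assign_shards(size_t num_shards, size_t num_workers)
{
    assert(num_workers > 0);
    std::vector<std::vector<size_t>> plan(num_workers);
    for (size_t i = 0; i < num_shards; ++i)
        plan[i % num_workers].push_back(i);
    return plan;
}

// Each worker walks only the shards it owns, so no locking is needed
// while processing a shard's entries.
void
process_all(std::vector<shard_t> &shards, size_t num_workers)
{
    const auto plan = assign_shards(shards.size(), num_workers);
    std::vector<std::thread> workers;
    for (size_t w = 0; w < num_workers; ++w) {
        workers.emplace_back([&shards, &plan, w]() {
            for (size_t idx : plan[w]) {
                for (int entry : shards[idx].entries)
                    shards[idx].sum += entry;
            }
        });
    }
    for (auto &worker : workers)
        worker.join();
}
```

Because the assignment is fixed before any worker starts, a shard is never touched by two workers. The cost is that one very long shard can leave other workers idle.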
A tool's parallel_shard_init() function is invoked to create shard-local data, which is then passed to each parallel_shard_memref() call. Errors are also shard-local and are retrieved with parallel_shard_error(). A parallel_shard_exit() is provided for cleanup, though most tools will sort, aggregate, and clean up in print_results().
Implements the new interface in the basic_counts and opcode_mix tools. More tools will be converted in the future.
Adds a new routine module_mapper_t::find_mapped_trace_bounds() so opcode_mix can perform local caching of mappings, to avoid a global lock bottleneck for module_mapper_t usage.
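The caching idea can be sketched as below. This is an illustration under stated assumptions, not the opcode_mix code: mock_mapper_t, region_t, shard_cache_t, and lookup() are hypothetical, and mock_mapper_t::find_bounds() only imitates the role of find_mapped_trace_bounds() without reproducing its signature.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical description of one mapped module region.
struct region_t {
    uint64_t start;
    size_t size;
};

// Hypothetical shared mapper whose lookups serialize on one global lock.
class mock_mapper_t {
public:
    explicit mock_mapper_t(std::vector<region_t> regions)
        : regions_(std::move(regions))
    {
    }

    // Returns the bounds of the region containing addr, if any.
    bool find_bounds(uint64_t addr, region_t *out)
    {
        std::lock_guard<std::mutex> guard(lock_);
        ++locked_lookups_;
        for (const region_t &region : regions_) {
            if (addr >= region.start && addr - region.start < region.size) {
                *out = region;
                return true;
            }
        }
        return false;
    }

    // Number of lookups that had to take the global lock.
    int locked_lookups()
    {
        std::lock_guard<std::mutex> guard(lock_);
        return locked_lookups_;
    }

private:
    std::mutex lock_;
    std::vector<region_t> regions_;
    int locked_lookups_ = 0;
};

// Shard-local cache of the most recently used region (size 0 means empty).
struct shard_cache_t {
    region_t last = { 0, 0 };
};

// Consults the shard's cache first and takes the global lock only on a miss.
bool
lookup(mock_mapper_t &mapper, shard_cache_t &cache, uint64_t addr, region_t *out)
{
    const region_t &last = cache.last;
    if (last.size > 0 && addr >= last.start && addr - last.start < last.size) {
        *out = last;
        return true;
    }
    if (!mapper.find_bounds(addr, out))
        return false;
    cache.last = *out;
    return true;
}
```

Since consecutive instructions usually fall in the same module, even a single-entry cache like this one would be expected to absorb most lookups; a tool that jumps between many modules might prefer a small per-shard table instead.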
Issue: #3230 (closed)