Adds a timestamp marker and a CPU marker to the trace buffer header written with each buffer unit a thread outputs. The timestamp was already present in the raw offline trace format, but it now appears in the final trace, both offline and online, as a new marker type. The CPU marker is entirely new and records which core the thread executed on, at the granularity of the buffer unit.
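As a rough illustration of the per-unit header described above, the following Python sketch models a buffer unit header that carries a timestamp marker followed by a CPU marker. The names `MarkerType`, `Marker`, and `write_unit_header`, as well as the record layout and marker order, are hypothetical stand-ins for this example and are not the tracer's actual types or encoding.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class MarkerType(Enum):
    # Hypothetical marker kinds for illustration; not the tracer's real enum.
    KERNEL_TRANSFER = auto()
    TIMESTAMP = auto()
    CPU_ID = auto()


@dataclass(frozen=True)
class Marker:
    kind: MarkerType
    value: int


def write_unit_header(timestamp_us: int, cpu: int) -> List[Marker]:
    """Build the header for one buffer unit at output time.

    The timestamp is sampled when the unit is written out, and the CPU
    is the core recorded for the thread for this unit.
    """
    return [
        Marker(MarkerType.TIMESTAMP, timestamp_us),
        Marker(MarkerType.CPU_ID, cpu),
    ]


# Two consecutive units from one thread that moved from core 2 to core 5.
unit_a = write_unit_header(timestamp_us=1_000, cpu=2)
unit_b = write_unit_header(timestamp_us=1_450, cpu=5)
```

Because the core is recorded once per unit, a migration that happens partway through a unit would only become visible in the next unit's header in a model like this one.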
Refactors the initial and per-output buffer headers to fix warts in the tracer: the initial buffer's timestamp is now taken at output time rather than at thread init time; the initial header is skipped more cleanly for virt2phys; and header usage is more uniform and easier to understand.
Updates the basic_counts tool to count these new scheduling markers separately from kernel transfer markers and other markers. Updates the corresponding documentation.
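The separation of marker counts can be sketched as a three-way tally. This is only a minimal Python model: the marker kind strings, the bucket names, and `count_markers` are assumptions made for the example and do not reflect the basic_counts tool's real code or output format.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical marker kind names used only in this sketch.
SCHEDULING_KINDS = {"timestamp", "cpu_id"}
KERNEL_XFER_KINDS = {"kernel_event", "kernel_xfer"}


def count_markers(markers: Iterable[Tuple[str, int]]) -> Counter:
    """Tally (kind, value) markers into three separate buckets."""
    counts = Counter(scheduling=0, kernel_transfer=0, other=0)
    for kind, _value in markers:
        if kind in SCHEDULING_KINDS:
            counts["scheduling"] += 1
        elif kind in KERNEL_XFER_KINDS:
            counts["kernel_transfer"] += 1
        else:
            counts["other"] += 1
    return counts


# Two buffer units' worth of headers plus one kernel transfer and one
# unrelated marker.
stream = [
    ("timestamp", 1000), ("cpu_id", 2),
    ("kernel_event", 0x4000),
    ("timestamp", 1450), ("cpu_id", 5),
    ("custom", 7),
]
counts = count_markers(stream)
```

Keeping the scheduling bucket separate means the per-unit header markers, which scale with the number of buffer units, do not inflate the counts of the other marker kinds.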
A forthcoming change will update the cache simulator to schedule threads according to the cores they executed on rather than using a thread round-robin scheme.
Issue: #2843 (closed)