[AArch64] Inline instr counting for trace_after_instrs in drcachesim
Today on AArch64, instruction counting for the -trace_after_instrs
option in drcachesim is not inlined. Instead, it is done with a clean call:
https://github.com/DynamoRIO/dynamorio/blob/5cbe8113c919b557f7bfc3daafffc9b1fef7938c/clients/drcachesim/tracer/tracer.cpp#L1430
We should inline it for better performance.
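
To make the intent concrete, here is a rough standalone model of the split being asked for: a cheap inlined add-and-compare on every block, with a call out only once the threshold is reached. It is only a sketch and does not use any DynamoRIO API. `InstrCounter`, `count_block`, and `hit_threshold` are hypothetical names invented for illustration, and the real change would emit the equivalent AArch64 instructions from the tracer's instrumentation code.

```cpp
#include <cstdint>

// Hypothetical model of the counter state; not taken from tracer.cpp.
struct InstrCounter {
    uint64_t count = 0;           // instructions counted so far
    uint64_t threshold = 0;       // stands in for the -trace_after_instrs value
    bool tracing = false;         // set once the threshold has been reached
    uint64_t slow_path_calls = 0; // how often the expensive path ran
};

// Slow path: stands in for the clean call. In this model it runs only
// when the threshold is crossed, not on every block.
void hit_threshold(InstrCounter &c)
{
    ++c.slow_path_calls;
    c.tracing = true;
}

// Fast path: stands in for the inlined sequence executed for each block
// (load counter, add block size, store, compare, conditional branch).
inline void count_block(InstrCounter &c, uint32_t instrs_in_block)
{
    if (c.tracing)
        return; // counting is assumed to stop once tracing starts
    c.count += instrs_in_block;
    if (c.count >= c.threshold)
        hit_threshold(c);
}
```

With a threshold of 10 and blocks of 4 instructions, the slow path in this model runs once, on the third block, and the first two blocks pay only for the add and compare.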