investigate frequency scaling cost of DR's ymm register preservation
Today, if the processor supports AVX, DR unconditionally preserves the full ymm registers on context switches. #639 covers doing so conditionally, but that is not trivial to implement. This issue covers investigating whether the ymm-register-to-memory instructions used for that preservation trigger frequency scaling, and how large the impact is, to help prioritize the #639 work. Non-floating-point AVX instructions are considered "lightweight" and may have little impact, though anything touching the top half of the ymm registers supposedly has some cost.
Xref https://en.wikichip.org/wiki/intel/frequency_behavior
Xref https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/
Xref https://indico.cern.ch/event/327306/contributions/760669/attachments/635800/875267/HaswellConundrum.pdf#page=8
What may make this hard to investigate is that the performance effects of AVX frequency scaling are difficult to measure with microbenchmarks: they affect neighboring cores, they vary by microarchitecture, and they have the most impact on large systems under heavy load.
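Since the effects reach neighboring cores, one possible starting point is to sample per-core frequency while a loaded workload runs natively and then again under DR, and compare the medians. The sketch below is only an illustration of that idea, not an agreed methodology: it assumes a Linux system that exposes the cpufreq sysfs file `scaling_cur_freq`, and the helper names (`read_khz`, `sample_khz`, `median_drop_percent`) are hypothetical. How faithfully `scaling_cur_freq` reflects AVX-related throttling depends on the cpufreq driver, so results would need cross-checking against hardware counters.

```python
import statistics
import time
from pathlib import Path

# Standard Linux cpufreq sysfs root; whether the files below exist
# depends on the kernel and cpufreq driver (assumption for this sketch).
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_khz(cpu, root=CPUFREQ_ROOT):
    """Return the reported frequency of `cpu` in kHz, or None if unavailable."""
    path = Path(root) / f"cpu{cpu}" / "cpufreq" / "scaling_cur_freq"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def sample_khz(cpus, count, interval_s, root=CPUFREQ_ROOT):
    """Take `count` samples for each core in `cpus`, `interval_s` apart.

    Sampling several cores at once is what lets us look for effects on
    neighbors of the core running the workload.
    """
    samples = {cpu: [] for cpu in cpus}
    for _ in range(count):
        for cpu in cpus:
            khz = read_khz(cpu, root)
            if khz is not None:
                samples[cpu].append(khz)
        time.sleep(interval_s)
    return samples


def median_drop_percent(baseline, under_test):
    """Percent by which the median frequency fell from baseline to under_test."""
    base = statistics.median(baseline)
    cur = statistics.median(under_test)
    return 100.0 * (base - cur) / base
```

A run might collect `sample_khz(range(4), 100, 0.05)` once with the workload running natively and once with it running under DR, then call `median_drop_percent` per core on the two sample lists.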