Preserves the scratch xmm register by spilling it before the scatter/gather instr expansion and restoring it afterwards. The spill slot is memory allocated from the thread-private heap using dr_thread_alloc.
Reserves a TLS slot for every thread in drx to store the pointer to the allocated xmm spill slot.
Extends drx's state restore to also restore the app value of the spilled xmm register.
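For illustration only, the spill/restore flow described above can be modelled in standalone C. This is a sketch, not the drx code: malloc stands in for dr_thread_alloc, a `_Thread_local` variable stands in for the reserved TLS slot, and every function name below is hypothetical. The real expansion emits spill and restore instructions into the instrumented code rather than calling C helpers.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics: __m128i, _mm_storeu_si128, ... */
#include <stdlib.h>

#define XMM_SPILL_SIZE 16 /* bytes in one xmm register */

/* Hypothetical stand-in for the per-thread TLS slot holding the spill-slot pointer. */
static _Thread_local void *xmm_spill_slot;

/* Thread init: allocate the spill slot once (malloc models dr_thread_alloc). */
static int thread_init_model(void) {
    xmm_spill_slot = malloc(XMM_SPILL_SIZE);
    return xmm_spill_slot != NULL;
}

/* Thread exit: release the spill slot. */
static void thread_exit_model(void) {
    free(xmm_spill_slot);
    xmm_spill_slot = NULL;
}

/* Models one expansion: spill the app value, clobber the scratch reg, restore it. */
static void expand_with_scratch_model(__m128i *scratch_reg) {
    assert(xmm_spill_slot != NULL);
    /* Spill the app value before the expansion uses the register. */
    _mm_storeu_si128((__m128i *)xmm_spill_slot, *scratch_reg);
    /* The expansion is free to clobber the scratch register here. */
    *scratch_reg = _mm_set1_epi32(0x5a5a5a5a);
    /* Restore the app value after the expansion. */
    *scratch_reg = _mm_loadu_si128((const __m128i *)xmm_spill_slot);
}

/* Returns 1 only if all 128 bits of a and b match. */
static int xmm_equal(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi32(a, b)) == 0xFFFF;
}
```

If a fault arrives between the clobber and the restore, the app value exists only in the spill slot, which is why the state restore also has to read it back from there.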
An alternative would be to extend drreg so that it can spill xmm regs too, but that is overkill here: the scatter/gather expansion needs only a small subset of that functionality, which is easily implemented as a manual spill and restore. When that support is available in drreg (#3844) we can use it here too; added a TODO.
Also extends the pure-asm scatter/gather test to verify that the value of the scratch reg is preserved across the scatter/gather instrs. (The scratch reg is xmm0 in the test, since we always select the lowest-numbered available xmm reg as scratch.)
Also fixes an issue with comparison of xmm regs in the test. cmpss writes the result of the comparison to the first reg, not the aflags. (We had replaced vpcmpud with cmpss in 106bf952 due to some SIGILL issues.)
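The cmpss behaviour can be seen with SSE intrinsics. The sketch below is not the test's asm, and its function names are hypothetical. It shows that cmpss (via `_mm_cmpeq_ss`) leaves an all-ones or all-zeros mask in the low 32 bits of its destination and looks only at the low scalar. It also shows one possible way to compare all 128 bits using pcmpeqd and pmovmskb; whether the test uses that sequence is an assumption.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE/SSE2 intrinsics */
#include <stdint.h>

/* cmpss with the EQ predicate: the mask lands in the low 32 bits of the
 * destination register; no flags are involved. */
static uint32_t cmpeq_ss_low_mask(float a, float b) {
    __m128 r = _mm_cmpeq_ss(_mm_set_ss(a), _mm_set_ss(b));
    return (uint32_t)_mm_cvtsi128_si32(_mm_castps_si128(r));
}

/* Whole-register equality: pcmpeqd per lane, then pmovmskb gathers one bit
 * per byte, so 0xFFFF means all 16 bytes matched. */
static int xmm_equal_128(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi32(a, b)) == 0xFFFF;
}

/* Small self-check exercising both helpers. */
static void check_cmp_model(void) {
    assert(cmpeq_ss_low_mask(1.0f, 1.0f) == 0xFFFFFFFFu);
    assert(cmpeq_ss_low_mask(1.0f, 2.0f) == 0u);
    __m128i x = _mm_set_epi32(4, 3, 2, 1);
    __m128i y = _mm_set_epi32(9, 3, 2, 1); /* differs only in the top lane */
    assert(xmm_equal_128(x, x));
    assert(!xmm_equal_128(x, y));
}
```

A conditional branch placed directly after cmpss therefore tests stale flags; the mask in the destination register has to be inspected instead.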
Issues: #2985 (closed)