Fixes a bug where the trailing magic word was missing from the signal frame, which caused the kernel to zero the AVX and later extended state on sigreturn instead of restoring it. We now copy the whole extended state in a single operation, rather than piecemeal, and check both the leading and the trailing magic words.
Fixes os_forge_exception() to properly set the fpstate magic words and sizes.
Disables the 32-bit xsave in save_xmm(), which causes failures on recent processors, so that the changes here can be tested properly on 32-bit. A complete fix is still needed there; that work is tracked by i#3256.
Updates comments to clarify the confusing extended_size vs xstate_size and magic word locations for 64-bit and 32-bit.
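For reviewers unfamiliar with the layout, here is a minimal standalone sketch of the 64-bit case. It is not the code in this change: the helper names (xstate_frame_is_valid, copy_xstate_frame, forge_xstate_frame) and the sw_bytes_t type are hypothetical, and the constants are re-declared locally on the assumption that they match the Linux uapi asm/sigcontext.h definitions. It assumes the first magic word sits in the software-reserved bytes at offset 464 of the fxsave area, that the second sits at fpstate + xstate_size, and that extended_size is xstate_size plus the 4 bytes of the second magic word. The 32-bit frame places a legacy header in front of this layout, which the sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror the Linux uapi <asm/sigcontext.h> values. */
#define FP_XSTATE_MAGIC1 0x46505853U
#define FP_XSTATE_MAGIC2 0x46505845U
#define FP_XSTATE_MAGIC2_SIZE sizeof(uint32_t)
#define FXSAVE_SIZE 512        /* legacy fxsave area */
#define XSAVE_HDR_SIZE 64      /* xsave header that follows the fxsave area */
#define SW_RESERVED_OFFSET 464 /* software-reserved bytes inside the fxsave area */

/* Hypothetical local copy of the kernel's struct _fpx_sw_bytes layout. */
typedef struct {
    uint32_t magic1;        /* FP_XSTATE_MAGIC1 */
    uint32_t extended_size; /* total size, including the trailing magic word */
    uint64_t xfeatures;     /* feature bitmask saved by xsave */
    uint32_t xstate_size;   /* size of the xsave data; magic2 follows it */
    uint32_t padding[7];
} sw_bytes_t;

/* Returns 1 only if both magic words are present and the sizes are sane. */
static int
xstate_frame_is_valid(const unsigned char *fpstate, size_t buf_size)
{
    sw_bytes_t sw;
    uint32_t magic2;
    if (buf_size < FXSAVE_SIZE)
        return 0;
    memcpy(&sw, fpstate + SW_RESERVED_OFFSET, sizeof(sw));
    if (sw.magic1 != FP_XSTATE_MAGIC1)
        return 0;
    if (sw.xstate_size < FXSAVE_SIZE + XSAVE_HDR_SIZE ||
        sw.xstate_size > sw.extended_size)
        return 0;
    if ((size_t)sw.xstate_size + FP_XSTATE_MAGIC2_SIZE > buf_size)
        return 0;
    memcpy(&magic2, fpstate + sw.xstate_size, sizeof(magic2));
    return magic2 == FP_XSTATE_MAGIC2;
}

/* Copies the xsave data plus the trailing magic word in one memcpy.
 * Returns the number of bytes copied, or 0 if the frame is invalid
 * or does not fit in dst.
 */
static size_t
copy_xstate_frame(unsigned char *dst, size_t dst_size,
                  const unsigned char *src, size_t src_size)
{
    sw_bytes_t sw;
    size_t total;
    if (!xstate_frame_is_valid(src, src_size))
        return 0;
    memcpy(&sw, src + SW_RESERVED_OFFSET, sizeof(sw));
    total = (size_t)sw.xstate_size + FP_XSTATE_MAGIC2_SIZE;
    if (total > dst_size)
        return 0;
    memcpy(dst, src, total);
    return total;
}

/* Fills in the magic words and sizes for an otherwise zeroed frame. */
static void
forge_xstate_frame(unsigned char *buf, size_t buf_size, uint32_t xstate_size)
{
    sw_bytes_t sw;
    uint32_t magic2 = FP_XSTATE_MAGIC2;
    assert(xstate_size >= FXSAVE_SIZE + XSAVE_HDR_SIZE);
    assert((size_t)xstate_size + FP_XSTATE_MAGIC2_SIZE <= buf_size);
    memset(buf, 0, buf_size);
    memset(&sw, 0, sizeof(sw));
    sw.magic1 = FP_XSTATE_MAGIC1;
    sw.xstate_size = xstate_size;
    /* 64-bit layout: the extended size adds only the trailing magic word. */
    sw.extended_size = xstate_size + (uint32_t)FP_XSTATE_MAGIC2_SIZE;
    sw.xfeatures = 0x7; /* x87 | SSE | AVX, chosen for illustration */
    memcpy(buf + SW_RESERVED_OFFSET, &sw, sizeof(sw));
    memcpy(buf + xstate_size, &magic2, sizeof(magic2));
}
```

With an 832-byte xstate_size (512 fxsave + 64 header + 256 for ymmh), the sketch copies 836 bytes; clearing the 4 bytes at offset 832 makes both the check and the copy fail, which is the condition under which the kernel would fall back to zeroing the extended state.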
Augments the linux.sigcontext test to verify that SIMD state is preserved across sigreturn.
Augments the api.detach_state test to verify that ymmh, zmm, and opmask state is preserved across detach.
Issue: #3812, #3256
Fixes #3812