Adds work-arounds for an AMD processor bug where the processor does not clear the hidden gs base when the gs selector is written. Because of this bug, pre-4.7 Linux kernels leave the prior thread's base in place on a context switch. When we attach and receive SIGUSR2 in a new thread, we can thus get the wrong dcontext; worse, we can get NULL in the handler but the wrong dcontext later during init.
To solve the problem on attach, we check the tid for threads receiving a takeover signal. For threads whose tid does not match, or for unknown threads, we set a non-zero "pre-init" value (the kernel ignores zero) in the gs base. We have to be careful not to clobber the valid gs base value of a temporarily-native thread whose magic field was deliberately set to invalid.
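The attach-time check described above can be sketched roughly as follows. This is a simplified illustration and not the actual code in this change: the names tls_block_t, takeover_tls_action, TLS_KEEP_BASE, and TLS_SET_PRE_INIT, as well as the magic constants, are hypothetical, and a NULL pointer stands in for a zero or unrecognized gs base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic values, chosen only for this sketch. */
#define TLS_MAGIC_VALID   0x244f4952u
#define TLS_MAGIC_INVALID 0x2d4f4952u

/* Hypothetical stand-in for the per-thread block the gs base points at. */
typedef struct {
    uint32_t magic; /* deliberately invalid while a thread runs native */
    int tid;        /* id of the thread that owns this block */
} tls_block_t;

typedef enum {
    TLS_KEEP_BASE,    /* the base belongs to this thread: leave it alone */
    TLS_SET_PRE_INIT  /* write the non-zero "pre-init" value to the gs base */
} tls_action_t;

/* Decide what to do with the gs base of a thread that just received the
 * takeover signal.  tls is NULL when no usable base was found.
 */
tls_action_t
takeover_tls_action(const tls_block_t *tls, int cur_tid)
{
    assert(cur_tid > 0);
    /* Unknown thread: nothing of ours is in the gs base yet. */
    if (tls == NULL)
        return TLS_SET_PRE_INIT;
    /* Stale base left behind by a prior thread: the tid does not match. */
    if (tls->tid != cur_tid)
        return TLS_SET_PRE_INIT;
    /* The tid matches, so the base is ours.  The decision is keyed on the
     * tid and not on the magic field, so a temporarily-native thread
     * (invalid magic, matching tid) keeps its valid base.
     */
    return TLS_KEEP_BASE;
}
```

Keying the decision on the tid rather than on the magic field is what keeps the temporarily-native case from being mistaken for a stale base in this sketch.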
On detach, we now set a non-zero value rather than zero in the gs base.
When sending a thread native or cloning a child thread, we're already leaving a copy of our base in place (with an invalid .magic field), which eliminates any problems there.
Manually tested on an AMD machine with an older kernel: api.startstop succeeded 1000 times in a row where before it failed every single run.
Fixes #3356