Adds static TLS support to the Windows private loader. This involves the following:
Swaps the TEB->ThreadLocalStoragePointer field between app and private states, with new dcontext fields to store the values.
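The swap can be pictured with a minimal sketch. `MockTeb`, `MockDcontext`, the field names `app_tls_array` and `priv_tls_array`, and `swap_tls_pointer()` are all hypothetical stand-ins for illustration, not the actual structures or names used in this change:

```cpp
#include <cassert>

// Hypothetical stand-in: the real TEB has many more fields.
struct MockTeb {
    void **ThreadLocalStoragePointer;
};

// Hypothetical stand-in for the new dcontext fields.
struct MockDcontext {
    void **app_tls_array;  // value the application expects to see
    void **priv_tls_array; // value the private libraries expect to see
};

// Saves the current pointer for the state being left, then installs the
// pointer for the state being entered.
inline void
swap_tls_pointer(MockTeb *teb, MockDcontext *dc, bool to_app)
{
    assert(teb != nullptr && dc != nullptr);
    if (to_app) {
        dc->priv_tls_array = teb->ThreadLocalStoragePointer;
        teb->ThreadLocalStoragePointer = dc->app_tls_array;
    } else {
        dc->app_tls_array = teb->ThreadLocalStoragePointer;
        teb->ThreadLocalStoragePointer = dc->priv_tls_array;
    }
}
```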
Adds an os_privmod_data_t to the Windows loader to store TLS data. Parses PE TLS data fields, records the TLS callbacks, sets the TLS array index, and records the TLS initialization data. Calls TLS callbacks prior to regular entry points; warns on crashes but does not consider them fatal.
TLS handling is simplified by using a hardcoded maximum size and by not supporting TLS for dynamically loaded libraries. This lets us allocate the array once at thread init and avoids any complex scheme to reallocate it. Prior to calling the thread init function for a library, its TLS data is allocated and initialized from its recorded initialization data.
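A rough sketch of the fixed-size scheme follows. The cap of 64, the `ModTlsInfo` and `ThreadTls` types, and the function names are assumptions made for illustration; the actual limit and data layout in this change may differ:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical cap: the description only says the maximum is hardcoded.
constexpr std::size_t MAX_TLS_SLOTS = 64;

// Hypothetical per-module record of what was parsed from the PE TLS data.
struct ModTlsInfo {
    std::size_t index;     // slot assigned to this module
    const void *init_data; // recorded initialization data
    std::size_t init_size; // bytes of initialization data
    std::size_t zero_fill; // extra zeroed bytes after the data
};

// Per-thread array, allocated once at thread init and never resized.
struct ThreadTls {
    void *slots[MAX_TLS_SLOTS];
};

inline ThreadTls *
thread_tls_create()
{
    return static_cast<ThreadTls *>(std::calloc(1, sizeof(ThreadTls)));
}

// Allocates this module's block for the current thread and copies in the
// recorded data. Returns false if the index exceeds the cap or allocation fails.
inline bool
thread_tls_init_module(ThreadTls *tls, const ModTlsInfo &mod)
{
    assert(tls != nullptr);
    if (mod.index >= MAX_TLS_SLOTS)
        return false;
    std::size_t total = mod.init_size + mod.zero_fill;
    void *block = std::calloc(1, total == 0 ? 1 : total);
    if (block == nullptr)
        return false;
    if (mod.init_size > 0)
        std::memcpy(block, mod.init_data, mod.init_size);
    tls->slots[mod.index] = block;
    return true;
}

inline void
thread_tls_destroy(ThreadTls *tls)
{
    for (void *slot : tls->slots)
        std::free(slot); // free(nullptr) is a no-op for unused slots
    std::free(tls);
}
```

Because the array never grows, a library loaded later cannot be given a slot beyond the cap, which is the trade-off accepted above.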
Since both process init and process exit library functions expect TLS to be set up, reorders several loader sequences:
- Partially unifies Windows and Linux by splitting loader_init() into loader_init_prologue(), called early to set up the private PEB used by arch_init for gencode, and loader_init_epilogue(), called after thread init so that we have a dcontext and can set up the TLS. However, Linux still needs relocs and TLS after thread init while Windows is the reverse, so the two platforms use a different ordering within the newly divided loader_init_{prologue,epilogue}(). This removes the #338 special casing.
- Splits out instrument_exit_event() from instrument_exit() and moves instrument_exit() to after the final thread exit, to ensure we call the thread exit library functions. Removes instrument_exit_post_sideline(), which is now merged into instrument_exit().
- Adds loader_make_exit_calls() to enable calling both the thread exit and process exit library functions before TLS is freed (yes, process exit functions blindly dereference TLS, just like process init functions do).
- Delays calling the process and thread init functions until the statically-imported set of libraries is fully loaded, so that we know the TLS array size.
Reverses the modlist iteration order for function calling, to properly call independent libraries first before their dependents.
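The effect of the reversed iteration can be sketched as below. `PrivMod` and `init_call_order()` are hypothetical, and the sketch assumes the list holds each library ahead of the libraries it imports, so that the tail of the list is the most independent library:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical module record; the real modlist is the loader's own list type.
struct PrivMod {
    std::string name;
};

// Walks the list from the tail so that independent libraries come first.
// Recording the name stands in for calling the library's entry point.
inline std::vector<std::string>
init_call_order(const std::vector<PrivMod> &modlist)
{
    std::vector<std::string> order;
    for (auto it = modlist.rbegin(); it != modlist.rend(); ++it)
        order.push_back(it->name);
    return order;
}
```

With a list of `client.dll`, `helper.dll`, `kernel32.dll` in that order, the sketch visits `kernel32.dll` first and `client.dll` last.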
Adds static TLS tests to the client.raw_tls test. Renames the test to client.tls, and changes the client to C++ to use std::vector and a custom class with the C++11 'thread_local' for TLS callback testing. However, the vector is disabled for Linux because it breaks the Linux private loader (i#4034). In addition, VS2013 doesn't support 'thread_local', so that part is disabled until we upgrade to VS2017.
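For readers unfamiliar with the construct, here is a minimal sketch of a custom class used with 'thread_local'. It is not the test's actual code: `Counter`, `tls_counter`, and `bump_twice_in_new_thread()` are made up for illustration. The non-trivial constructor is deliberate, since dynamic initialization of such an object is the kind of per-thread work that Windows toolchains typically route through TLS callbacks:

```cpp
#include <cassert>
#include <thread>

// Hypothetical test class with a non-trivial constructor.
class Counter {
public:
    Counter() { count_ = 10; }
    int bump() { return ++count_; }

private:
    int count_;
};

// Each thread gets its own independent Counter instance.
thread_local Counter tls_counter;

// Bumps the new thread's own counter twice and returns the final value.
inline int
bump_twice_in_new_thread()
{
    int result = 0;
    std::thread worker([&result] {
        tls_counter.bump();
        result = tls_counter.bump();
    });
    worker.join();
    return result;
}
```

The counter in the new thread starts from its own freshly constructed value, so bumps made on the calling thread do not affect it.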
Issue: #338, #4002, #4030, #4034
Fixes #4030