Because relocation can invoke an ifunc that accesses TLS, on UNIX we move loader_init() to after thread init. Calling of the custom delayed init functions is moved back to privload finalization. The ELF TLS block setup for the primary thread is moved out of TLS block discovery time and now happens after relocation (but before init function calling). Windows remains unchanged as it has other ordering requirements (i#338).
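
The resulting UNIX ordering can be sketched as below. This is only an illustrative model, not the actual loader code: apart from `loader_init()` named above, every function, variable, and step name here is hypothetical, and the "TLS is usable" flag is a stand-in assumption for whatever thread init really sets up.

```c
#include <assert.h>
#include <string.h>

/* Illustrative model only: all names below are hypothetical, not real loader symbols. */
enum { MAX_STEPS = 8 };
static const char *steps[MAX_STEPS]; /* order in which the modeled steps ran */
static int num_steps;
static int tls_usable;               /* assumed to be set by thread init */
static int ifunc_saw_tls = -1;       /* what the modeled ifunc resolver observed */

static void record(const char *name) {
    if (num_steps < MAX_STEPS)
        steps[num_steps++] = name;
}

static void sketch_thread_init(void) {
    tls_usable = 1;
    record("thread_init");
}

/* Relocation may run an ifunc resolver, which may touch TLS. */
static void sketch_relocate(void) {
    ifunc_saw_tls = tls_usable;
    record("relocate");
}

static void sketch_primary_tls_block_setup(void) {
    record("primary_tls_block_setup");
}

static void sketch_call_delayed_init_funcs(void) {
    record("delayed_init_funcs");
}

/* Runs the modeled steps in the order described above; returns the step count. */
static int run_unix_startup_sketch(void) {
    num_steps = 0;
    tls_usable = 0;
    ifunc_saw_tls = -1;
    sketch_thread_init();              /* thread init now comes first */
    sketch_relocate();                 /* part of loader init: ifunc can see TLS */
    sketch_primary_tls_block_setup();  /* after relocation */
    sketch_call_delayed_init_funcs();  /* at finalization, after the TLS block setup */
    return num_steps;
}
```

Calling `run_unix_startup_sketch()` and inspecting `steps` shows the intended order; with the relocation step placed before thread init instead, the modeled resolver would observe `tls_usable == 0`.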
Fixes #2751
Issue: #338