Slow attach times due to protecting stack/heap allocations
Created by: Carrotman42
NOTE: the following describes an incorrect solution to the problem of slow attach times. See further down for details about this issue.
During attach on Unix, the thread executing os_take_over_all_unknown_threads:
- obtains the thread_initexit_lock
- (does some other stuff)
- calls thread_signal to all other threads
- unlocks thread_initexit_lock
- waits on each thread's "event" one at a time until all threads are suspended
Every thread acquires thread_initexit_lock during its attach logic, so I believe that signaling all threads and only then releasing the lock causes heavy contention on thread_initexit_lock. With 3000 threads in my simple repro, attaching takes 70 seconds, whereas the following implementation finishes in under a second:
- obtains the thread_initexit_lock
- (does some other stuff)
- unlocks thread_initexit_lock
- for each thread:
  - signal that thread
  - wait for that thread's "event"
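
To make the difference between the two orderings concrete, here is a minimal Python model of both. It is only a sketch and not DynamoRIO code: `takeover`, `initexit_lock`, `signals`, and `suspended` are hypothetical names, a `threading.Event` stands in for signal delivery via thread_signal, and a plain `threading.Lock` stands in for thread_initexit_lock. It shows the order of operations only and is not expected to reproduce the timings quoted above.

```python
import threading

def takeover(num_threads, one_at_a_time):
    """Model the attach sequence; return how many threads reported suspended."""
    initexit_lock = threading.Lock()  # stand-in for thread_initexit_lock
    # signals[i] models delivering a signal to thread i (assumption).
    signals = [threading.Event() for _ in range(num_threads)]
    # suspended[i] models thread i's per-thread "event".
    suspended = [threading.Event() for _ in range(num_threads)]

    def worker(i):
        signals[i].wait()       # block until this thread is "signaled"
        with initexit_lock:     # every thread takes the lock in its attach logic
            pass
        suspended[i].set()      # report that this thread is now suspended

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()

    with initexit_lock:
        # (does some other stuff)
        if not one_at_a_time:
            # Original ordering: signal every thread while still holding the
            # lock, so all of them pile up on it once it is released.
            for s in signals:
                s.set()

    if one_at_a_time:
        # Proposed ordering: lock already released; signal one thread and
        # wait for its event before moving on to the next.
        for i in range(num_threads):
            signals[i].set()
            suspended[i].wait()
    else:
        for e in suspended:
            e.wait()

    for t in threads:
        t.join()
    return sum(e.is_set() for e in suspended)
```

Calling `takeover(n, False)` follows the first list in this report and `takeover(n, True)` follows the second; in both cases every modeled thread ends up suspended, and only the amount of simultaneous waiting on the lock differs.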