support client threads
From derek.br...@gmail.com on February 24, 2009 10:56:06
this was PR 222812
as an alternative to (and superset of) custom nudges (PR 200067), we should add support for creating new transparent threads that can then poll files, etc.
from discussion:
- communication restrictions: DR or client can't just wait on thread
- client thread is less restricted than client executing in app threads: when it's not communicating w/ rest of client or DR, it can share libraries w/ app.
- usage: communication in ("push"): seems like custom nudge (PR 200067) can do the same work (can loop in nudge routine: make sure synch routines consider native: thread will kill itself before returning to code cache) w/o expectations of supporting many threads + communication between them. does it need synch? for push usage, say for adaptive instrumentation, calling dr_flush may be all it needs to do.
- 2nd usage: sideline: for that need to revive decode_fragment, replace_fragment, etc.
so, moving this case to later scheduling and considering it to mainly cover sideline-type parallel-analysis uses
adding file wait is part of PR 202946
Tim in PR 200067:
I've run into some issues with this. The easiest one is that we're not well set up for a polling thread, since there is no yield, sleep, or wait support, though that part is relatively easy to fix. The more problematic case is around synchronization issues. We can't consider the thread native: native threads only grab DR or client locks during short, bounded, defined periods that we can detect when we suspend (syscall interception code, etc.) and are fine the rest of the time. A polling nudge thread, on the other hand, is in DR or client code all the time, so there is no good place to suspend it that we can easily detect.
If we ignore client locks and fix cases like PR 225020, then perhaps we could consider it ok to suspend the thread if it was in client dll code (as opposed to DR, or ntdll via DR with special-case handling of yield/sleep). For client locks we could try tracking them via our lock api routines, but that starts getting pretty messy. Alternatively, we could suspend the nudge thread last; presumably at that point we wouldn't care about client locks anymore since they couldn't block anyone. Alternatively, we could try not suspending the nudge thread at all (at least if we were only targeting flush) since it's not using the cache. That would work if synch_all users were only using it to handle in-cache threads, but some users (including flush, it looks like) are using it to break (or at least not hold) locks, do unsafe data structure modifications, etc., which we'd have to verify against all our API routines.
I think the only really workable thing above is to special case the nudge thread and suspend it last (so even if it's holding a client lock it can't block anyone else) and only when it's in the client dll code (225020 will be fixed soon and I don't know of any other problematic cases like that). Either that or not supporting using a nudge as a polling thread (i.e. synch is blocked till the nudge thread finishes).
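The "suspend it last, and only when it's in the client dll code" idea can be illustrated with a small standalone sketch. This is not DR code: `thread_rec_t`, its fields, and `plan_suspend_order()` are hypothetical names invented for illustration, and the real synch-all logic would have to inspect actual thread state rather than a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread record for illustration; not a DR structure. */
typedef struct {
    int id;
    bool client_owned;   /* polling/nudge thread running on behalf of the client */
    bool in_client_code; /* assumed flag: pc is currently inside the client dll */
} thread_rec_t;

/* Fills order[] with indices into threads[]: all app threads first, then all
 * client-owned threads, so a client lock held by a client-owned thread can no
 * longer block anyone by the time that thread is suspended.
 * Returns false if some client-owned thread is outside client dll code, in
 * which case the caller is expected to back off and retry later.
 */
bool
plan_suspend_order(const thread_rec_t *threads, size_t n, size_t *order)
{
    size_t out = 0;
    bool safe = true;
    assert(threads != NULL && order != NULL);
    /* Pass 1: app threads, in their original relative order. */
    for (size_t i = 0; i < n; i++) {
        if (!threads[i].client_owned)
            order[out++] = i;
    }
    /* Pass 2: client-owned threads go last. */
    for (size_t i = 0; i < n; i++) {
        if (threads[i].client_owned) {
            if (!threads[i].in_client_code)
                safe = false;
            order[out++] = i;
        }
    }
    return safe;
}
```

The two-pass layout keeps the ordering rule and the "is it safe right now" check separate, which mirrors the two conditions in the proposal above.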
post-nudge feature: What remains to be done is to get the sideline threads to use the client-owned thread synchronization support from the nudge work. Easiest is probably starting the sideline thread via an internal nudge (though we'll need an argument for that: xref PR 231295).
thread creation status from t222812-etc-minifeatures tree:
- I put my preliminary stuff, which was tested, under CLIENT_SIDELINE for now: dr_create_client_thread(), dr_terminate_client_thread(), cleanup_and_terminate_client_thread, dr_thread_yield()
- didn't fully test native treatment with thread doing syscalls, etc.
- what about our check_sole_thread() checks
- NYI for Linux for now: can borrow from create_thread() in x86/sideline.c, with a different stack freeing model: will need to tweak cleanup_and_terminate_client_thread to be os-neutral
- FIXME: provide cond vars, other thread utils? xref discussion above; here I had dr_thread_yield() which doesn't really solve anything.
- code comments:
  - FIXME PR 210591: transparency issues:
    - All dlls will be notified of thread creation by DLL_THREAD_ATTACH
    - The thread will show up in the list of threads accessed by NtQuerySystemInformation's SystemProcessesAndThreadsInformation structure.
  - FIXME PR 202669: if the client leaves reservation space we should have the stack auto-expand.
- what will a stack overflow be reported as?
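On the cond-var FIXME in the list above: a rough standalone analogue of why a timed wait is more useful to a polling thread than a bare yield. This sketch uses POSIX threads, not the DR API, and `poller_t`, `poller_start()`, and `poller_stop()` are hypothetical names. The poll itself is a placeholder counter; a real poller would likely drop the lock while doing its work.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical polling-thread state; names are illustrative only. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t wake;
    bool stop; /* set by the owner to request termination */
    int polls; /* number of poll iterations performed */
} poller_t;

static void *
poll_loop(void *arg)
{
    poller_t *p = arg;
    pthread_mutex_lock(&p->lock);
    while (!p->stop) {
        struct timespec deadline;
        p->polls++; /* placeholder for the real poll (e.g. checking a file) */
        /* Block for up to 10 ms instead of spinning on a yield; the owner can
         * cut the wait short by signalling the condition variable.
         */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 10L * 1000 * 1000;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
        pthread_cond_timedwait(&p->wake, &p->lock, &deadline);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Initializes the state and starts the polling thread; returns 0 on success. */
int
poller_start(poller_t *p, pthread_t *tid)
{
    assert(p != NULL && tid != NULL);
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->wake, NULL);
    p->stop = false;
    p->polls = 0;
    return pthread_create(tid, NULL, poll_loop, p);
}

/* Asks the thread to exit, waits for it, and returns the poll count
 * (or -1 if the join failed).
 */
int
poller_stop(poller_t *p, pthread_t tid)
{
    pthread_mutex_lock(&p->lock);
    p->stop = true;
    pthread_cond_signal(&p->wake);
    pthread_mutex_unlock(&p->lock);
    return pthread_join(tid, NULL) == 0 ? p->polls : -1;
}
```

With only a yield available, the loop would burn CPU between polls and the owner would have no way to wake the thread early. The timed wait gives both a bounded sleep and a prompt shutdown path.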
Original issue: http://code.google.com/p/dynamorio/issues/detail?id=41