The case of the UI thread that hung in a kernel call

a year ago
  • #debugging
  • #thread-suspension
  • #deadlock
  • A customer reported a UI thread hang that could not be diagnosed at first because the thread's stack was paged out.
  • The thread had been suspended for over five hours, yet no debugger was attached that could explain the suspension.
  • A watchdog thread within the same process was found to suspend the UI thread to capture stack traces, leading to a deadlock.
  • The deadlock occurred because the UI thread held a lock needed by the watchdog thread to capture the stack trace.
  • Suspending threads within the same process risks deadlock if the suspended thread holds resources needed by others.
  • The fix is to move the watchdog into a separate process, so that it does not share locks with the thread it suspends.
  • The kernel delays thread suspension to avoid interrupting critical operations, but this doesn't prevent user-mode deadlocks.
  • Microsoft's design choices around thread suspension and loader locks were criticized, but the root issue was the in-process watchdog.
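
The deadlock described in the points above can be modeled without any Windows APIs. The following is a minimal sketch, not the customer's code: Python has no way to suspend a thread, so "suspension" is simulated by parking the UI thread on an event while it holds a lock. All names here (`simulate_in_process_watchdog`, `stack_walk_lock`, `resumed`) are hypothetical, and the timeout exists only so the demonstration terminates; a real in-process watchdog would wait forever.

```python
import threading


def simulate_in_process_watchdog(watchdog_timeout=0.5):
    """Return True if the watchdog managed to 'capture the stack'."""
    # Stands in for whatever lock the stack-capture code needs (assumption).
    stack_walk_lock = threading.Lock()
    holding_lock = threading.Event()   # set once the UI thread owns the lock
    resumed = threading.Event()        # unset models "suspended"
    result = {}

    def ui_thread():
        with stack_walk_lock:
            holding_lock.set()
            # Models being suspended while still holding the lock.
            resumed.wait()

    def watchdog():
        # "Suspend" the UI thread at the worst possible moment.
        holding_lock.wait()
        # Capturing the stack needs the lock the suspended thread holds.
        captured = stack_walk_lock.acquire(timeout=watchdog_timeout)
        result["captured"] = captured
        if captured:
            stack_walk_lock.release()
        # Without the timeout above, this resume would never be reached.
        resumed.set()

    threads = [threading.Thread(target=ui_thread),
               threading.Thread(target=watchdog)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["captured"]


if __name__ == "__main__":
    print("stack captured:", simulate_in_process_watchdog())
```

In this model the watchdog never gets the lock, because the only thread that can release it is the one the watchdog has frozen. A watchdog running in a separate process does not contend for the target's in-process locks, which is why the summary recommends that design.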