Hasty Briefs (beta)

How to reproduce and fix an I/O data race with Go and DTrace

3 days ago
  • #Race Condition
  • #TOCTOU
  • #Go
  • A test failure in CI was traced to a data race on a file: the goroutine writing the file and the goroutine reading it were not properly synchronized.
  • The root cause was a TOCTOU (time-of-check to time-of-use) bug: the reader checked that the file existed before reading it, but the file existing did not mean the write had completed.
  • A minimal Go reproducer demonstrated the race condition, with the read operation sometimes seeing an empty or partial file.
  • DTrace was used to observe system calls and Go function calls, revealing how the write and read operations interleaved.
  • The `chill` DTrace action, which stalls execution for a given number of nanoseconds when a probe fires, was used to simulate disk latency and make the race more likely to occur.
  • The fix removed the redundant `os.Stat` check and retried the read until it succeeded, eliminating the TOCTOU window.
  • The article argues that calling `stat(2)` before a file operation is usually unnecessary and can itself introduce TOCTOU bugs.
  • The `chill` DTrace action is highlighted as a useful tool for simulating latency and detecting race conditions.
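
The race and the fix summarized above can be illustrated with a minimal Go sketch. This is not the article's reproducer: the names `readAfterStat`, `readWithRetry`, and `demo`, the file name, the retry limits, and the "non-empty means ready" rule are all assumptions made for illustration. Channels stand in for the latency that the article injects with DTrace, so the bad interleaving happens on every run.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// readAfterStat is the racy pattern: check that the file exists, then read once.
// Existence only proves the file was created, not that it was fully written.
func readAfterStat(path string) ([]byte, error) {
	if _, err := os.Stat(path); err != nil {
		return nil, err
	}
	return os.ReadFile(path)
}

// readWithRetry skips the stat and retries the read itself.
// Assumption: a non-empty file means the writer is done (true here because
// the payload is written with a single small write).
func readWithRetry(path string, attempts int, delay time.Duration) ([]byte, error) {
	lastErr := errors.New("no attempts made")
	for i := 0; i < attempts; i++ {
		data, err := os.ReadFile(path)
		if err == nil && len(data) > 0 {
			return data, nil
		}
		if err != nil {
			lastErr = err
		} else {
			lastErr = errors.New("file is still empty")
		}
		time.Sleep(delay)
	}
	return nil, fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

// demo forces the problematic interleaving: the writer creates the file,
// then stalls (as if the disk were slow) before writing the content.
func demo() (string, string, error) {
	dir, err := os.MkdirTemp("", "toctou")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "out.txt")

	created := make(chan struct{})
	proceed := make(chan struct{})
	done := make(chan error, 1)
	go func() {
		f, err := os.Create(path)
		close(created)
		if err != nil {
			done <- err
			return
		}
		<-proceed // simulated latency between create and write
		_, err = f.WriteString("hello")
		if cerr := f.Close(); err == nil {
			err = cerr
		}
		done <- err
	}()

	<-created
	racy, err := readAfterStat(path) // stat succeeds, but the file is still empty
	close(proceed)                   // let the writer finish
	if err != nil {
		return "", "", err
	}
	fixed, err := readWithRetry(path, 100, 10*time.Millisecond)
	if err != nil {
		return "", "", err
	}
	return string(racy), string(fixed), <-done
}

func main() {
	racy, fixed, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("stat-then-read saw %q; retrying read saw %q\n", racy, fixed)
}
```

In this sketch the stat-then-read path returns an empty result even though `os.Stat` reported success, while the retrying read returns the full content. Real code would likely need a stronger completeness check than "non-empty", since a larger payload written in several chunks could still be observed partially.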