How to reproduce and fix an I/O data race with Go and DTrace
3 days ago
- #Race Condition
- #TOCTOU
- #Go
- A test failure in CI was traced to a data race on a file due to improper synchronization between writing and reading goroutines.
- The root cause was a TOCTOU (time-of-check to time-of-use) bug: the reader checked that the file existed before reading it, but the file existing did not mean the write had completed.
- A minimal Go reproducer demonstrated the race condition, with the read operation sometimes seeing an empty or partial file.
- DTrace was used to observe system and Go function calls, revealing the interleaving of write and read operations.
- The `chill` DTrace action was employed to simulate disk latency, increasing the likelihood of the race condition.
- The fix removed the redundant `os.Stat` check and retried the read until it succeeded, which eliminated the TOCTOU issue.
- The article emphasizes that calling `stat(2)` before a file operation is usually unnecessary and can introduce TOCTOU bugs.
- The `chill` DTrace action is highlighted as a useful tool for simulating latency and detecting race conditions.