An almost catastrophic OpenZFS bug and the humans that made it

10 months ago
  • #Software Bugs
  • #OpenZFS
  • #Rust
  • A critical bug was found in OpenZFS's `vdev_raidz_asize_to_psize` function, which incorrectly returned the input `asize` instead of the calculated `psize`.
  • The bug could lead to data corruption by writing past allocated disk space, a silent and dangerous failure.
  • The bug was discovered during testing with aggressive allocator fragmentation settings, highlighting the importance of thorough testing.
  • Static analyzers for C could have flagged that `psize` was computed but never used, but such tools are rarely part of everyday workflows because of their cost and false positives.
  • Rust's type system could prevent this class of bug: if the two quantities were distinct `PhysicalSize` and `AllocatedSize` types, returning one in place of the other would be a compile-time error.
  • The discussion reflects on how limited humans are at spotting such errors, and on the value of tooling that catches these mistakes rather than relying solely on programmer competence.
  • The narrative challenges the notion that 'competent programmers don't need tools,' emphasizing that even experienced developers can overlook subtle bugs.
  • The author expresses a nuanced view on Rust, appreciating its safety features while acknowledging the learning curve and potential mismatches for certain tasks.
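
The distinct-types idea from the list above can be sketched in Rust as follows. This is a minimal illustration, not the OpenZFS code: the `asize_to_psize` function, its `data_cols` and `parity` parameters, and the conversion arithmetic are all hypothetical placeholders chosen to keep the example short. Only the type names `AllocatedSize` and `PhysicalSize` come from the summary.

```rust
/// Bytes reserved on disk, including parity and padding.
/// Hypothetical newtype wrapper around a raw byte count.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AllocatedSize(pub u64);

/// Bytes of actual data that fit inside an allocation.
/// Hypothetical newtype wrapper around a raw byte count.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PhysicalSize(pub u64);

/// Toy conversion, NOT the real RAID-Z formula.
/// Assumption: every group of `data_cols + parity` units carries
/// `data_cols` units of data, and `data_cols + parity` is nonzero.
pub fn asize_to_psize(asize: AllocatedSize, data_cols: u64, parity: u64) -> PhysicalSize {
    let psize = asize.0 / (data_cols + parity) * data_cols;

    // Writing `asize` here instead, which mirrors the bug described above,
    // would be rejected by the compiler with a mismatched-types error:
    // expected `PhysicalSize`, found `AllocatedSize`.
    PhysicalSize(psize)
}
```

With raw `u64` values on both sides, returning the input instead of the result would compile cleanly, just as the C version did. The wrappers cost nothing at runtime, but they force each conversion between the two sizes to be written out explicitly.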