Are compilers deterministic?
3 days ago
- #reproducible-builds
- #LLM-coding
- #compiler-determinism
- Betteridge's law says that any headline posed as a question can be answered with 'no', and for everyday developer experience that answer mostly holds.
- The computer science view and the engineering view differ: in theory a compiler is deterministic given its full input state, but real builds rarely control every input, so output drifts.
- Ksplice's work in the 2000s on patching running Linux kernels without reboots showed how compiler output can diverge in practice from what the source change intended.
- Compiler output can vary due to factors like register allocation, pass behavior, and section/layout changes, even with the same source intent.
- GCC bug 18574 illustrates how pointer-hash instability can affect traversal order and SSA coalescing.
- Debian and the reproducible-builds effort (since 2013) aim for bit-for-bit identical artifacts from the same source and build instructions.
- Practical steps for reproducible builds include setting TZ=UTC and LC_ALL=C, using SOURCE_DATE_EPOCH for embedded timestamps, and passing specific compiler flags.
- The discussion extends to LLMs: is vibecoding sane given that model output is nondeterministic? As with compilers, the CS answer and the engineering answer differ.
- Engineering relies on controlled interfaces, test oracles, reproducible pipelines, and observability, not perfect determinism.
- LLM-assisted coding follows the same pattern: the model's nondeterminism is managed with control boundaries and output verification rather than eliminated.
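
The reproducible-build points above (fixed traversal order, SOURCE_DATE_EPOCH, and bit-for-bit comparison) can be sketched in Python. This is a minimal illustration under stated assumptions and is not taken from the article: `build_archive` and `make_tree` are hypothetical helper names, the archive stands in for a real build artifact, and falling back to 0 when SOURCE_DATE_EPOCH is unset is a choice made for this sketch only.

```python
import gzip
import hashlib
import io
import os
import tarfile
import tempfile
from pathlib import Path


def build_archive(src_dir: Path) -> bytes:
    """Pack src_dir into .tar.gz bytes that depend only on paths, contents, and the epoch."""
    # SOURCE_DATE_EPOCH is the reproducible-builds convention for the build timestamp.
    # Assumption of this sketch: use 0 when the variable is unset.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "0"))

    files = [p for p in src_dir.rglob("*") if p.is_file()]
    # Directory listing order is filesystem-dependent, so sort explicitly.
    files.sort(key=lambda p: p.relative_to(src_dir).as_posix())

    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in files:
            data = path.read_bytes()
            info = tarfile.TarInfo(name=path.relative_to(src_dir).as_posix())
            info.size = len(data)
            info.mtime = epoch           # ignore the file's real mtime
            info.mode = 0o644            # ignore the local umask
            info.uid = info.gid = 0      # ignore the building user
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))

    out = io.BytesIO()
    # The gzip header also embeds a timestamp and a file name; pin both.
    with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=epoch) as gz:
        gz.write(tar_buf.getvalue())
    return out.getvalue()


def make_tree(root: Path, names: list) -> None:
    """Hypothetical helper: create files in the given order with differing mtimes."""
    for i, name in enumerate(names):
        f = root / name
        f.parent.mkdir(parents=True, exist_ok=True)
        f.write_text(f"contents of {name}\n")
        os.utime(f, (1_000_000 + i, 1_000_000 + i))


if __name__ == "__main__":
    os.environ["SOURCE_DATE_EPOCH"] = "1700000000"
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        # Same content, different creation order and different mtimes.
        make_tree(Path(a), ["src/main.c", "src/util.c", "README"])
        make_tree(Path(b), ["README", "src/util.c", "src/main.c"])
        digest_a = hashlib.sha256(build_archive(Path(a))).hexdigest()
        digest_b = hashlib.sha256(build_archive(Path(b))).hexdigest()
        print(digest_a == digest_b)
```

The two trees differ in every input the build does not mean to depend on (creation order, file mtimes, temporary path), yet the digests should match because each of those inputs is either sorted or pinned. Comparing digests is the "test oracle" half of the engineering answer: the pipeline is trusted because its output is checked, not because determinism is assumed.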