A Couple Million Lines of Haskell: Production Engineering at Mercury
- #Haskell
- #Production Engineering
- #Reliability
- Haskell works in production at Mercury, despite a codebase of roughly 2M lines and many engineers learning the language on the job, because of operational practices rather than language elegance alone.
- Reliability is viewed as adaptive capacity (systems that can absorb variation and degrade gracefully) rather than only as failure prevention, which matters when the team grows quickly and institutional knowledge churns.
- The type system is an operational aid that encodes institutional knowledge, making safe paths easy and dangerous operations hard to misuse, helping preserve knowledge as teams change.
- Purity is treated as a boundary enforced by interfaces (e.g., the ST monad): dangerous operations such as mutation are allowed inside contained scopes rather than avoided entirely.
- Types are used to make correct operational procedures (e.g., transactional event publishing) the path of least resistance, embedding critical invariants so they can't be forgotten.
- Durable execution via Temporal replaces fragile state machines with workflows that handle retries, timeouts, and crashes deterministically, improving operational resilience.
- Domain errors are modeled as domain types, not as transport-specific values such as HTTP statuses, and are translated at the boundaries; this keeps one context's abstractions from leaking into another (e.g., HTTP handlers vs. cron jobs).
- Type encoding tradeoffs: encode invariants that prevent silent corruption, use runtime checks for loud failures, avoid over-modeling the domain, and encapsulate complexity to avoid burdening teams.
- Design for introspection using records of functions or effect systems to allow instrumentation, tracing, and mocking without modifying library code, ensuring operational visibility.
- Production involves compromises: libraries use unsafePerformIO, testing gaps remain, and such tradeoffs accumulate over time; the key is to contain them with discipline and document them.
- Haskell's value in production emerges over months, not days, through easier refactoring, preserved knowledge, and fewer accidental bugs; pragmatism and hiring generalists (not just experts) remain essential.
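
The point about purity as a boundary can be made concrete with a small sketch. The function below is illustrative and not taken from Mercury's codebase: `sumSquares` is a hypothetical name, and it uses only `Control.Monad.ST` and `Data.STRef` from base. It mutates an accumulator internally while exposing a pure signature.

```haskell
import Control.Monad.ST (runST)
import Data.Foldable (for_)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- | Sum of squares computed with a mutable accumulator.
-- The signature is pure: the type of runST keeps the STRef from
-- escaping, so callers cannot observe the mutation.
sumSquares :: [Int] -> Int
sumSquares xs = runST $ do
  acc <- newSTRef 0
  -- Strict in-place update of the accumulator for each element.
  for_ xs $ \x -> modifySTRef' acc (+ x * x)
  readSTRef acc
```

The mutation is real, but it is contained: nothing outside `sumSquares` can obtain the reference, which is the sense in which the interface, not the absence of mutation, enforces purity.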
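
One possible reading of "transactional event publishing as the path of least resistance" is that events can only be published from inside a transaction type, so they cannot be emitted without the surrounding commit. The sketch below is a pure model under that assumption. `Tx`, `publish`, `commit`, `Event`, and `approveInvoice` are hypothetical names, and a real system would write to a database rather than accumulate a list.

```haskell
-- Hypothetical event payload.
newtype Event = Event String
  deriving (Eq, Show)

-- A transaction that accumulates events alongside its result.
-- In a real module the Tx constructor would not be exported, so the
-- only way to emit an event is 'publish' and the only way to run a
-- transaction is 'commit'.
newtype Tx a = Tx ([Event], a)

instance Functor Tx where
  fmap f (Tx (es, a)) = Tx (es, f a)

instance Applicative Tx where
  pure a = Tx ([], a)
  Tx (es1, f) <*> Tx (es2, a) = Tx (es1 ++ es2, f a)

instance Monad Tx where
  Tx (es1, a) >>= k =
    let Tx (es2, b) = k a
     in Tx (es1 ++ es2, b)

-- | Queue an event; it only becomes visible when the transaction commits.
publish :: Event -> Tx ()
publish e = Tx ([e], ())

-- | Run the transaction, returning the result together with the events
-- that must be persisted with it.
commit :: Tx a -> (a, [Event])
commit (Tx (es, a)) = (a, es)

-- Hypothetical business operation: the state change and its event
-- travel together by construction.
approveInvoice :: Int -> Tx String
approveInvoice n = do
  publish (Event ("invoice-approved:" ++ show n))
  pure ("approved " ++ show n)
```

Because `publish` returns a `Tx ()` rather than an `IO ()`, forgetting to wrap it in a transaction is a type error rather than a latent production bug.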
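
Modeling domain errors as domain types and translating them at the boundaries might look like the following sketch. All names (`TransferError`, `toHttpStatus`, `toJobOutcome`, `transfer`) and the specific status codes are assumptions chosen for illustration, not details from the source.

```haskell
-- Hypothetical domain error: it says what went wrong in business
-- terms and knows nothing about HTTP or job scheduling.
data TransferError
  = InsufficientFunds
  | AccountFrozen
  | UnknownRecipient String
  deriving (Eq, Show)

-- Core logic returns the domain error only.
transfer :: Int -> Int -> Either TransferError Int
transfer balance amount
  | amount > balance = Left InsufficientFunds
  | otherwise = Right (balance - amount)

-- Boundary 1: an HTTP handler picks a status code (illustrative mapping).
toHttpStatus :: TransferError -> Int
toHttpStatus InsufficientFunds = 422
toHttpStatus AccountFrozen = 403
toHttpStatus (UnknownRecipient _) = 404

-- Boundary 2: a cron job decides whether to retry instead.
data JobOutcome = RetryLater | GiveUp String
  deriving (Eq, Show)

toJobOutcome :: TransferError -> JobOutcome
toJobOutcome InsufficientFunds = RetryLater
toJobOutcome AccountFrozen = GiveUp "account frozen"
toJobOutcome (UnknownRecipient r) = GiveUp ("unknown recipient: " ++ r)
```

The same `TransferError` is interpreted differently by each caller, and adding a constructor makes the compiler flag every boundary translation that needs updating (with incomplete-pattern warnings enabled).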
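
Designing for introspection with a record of functions can be sketched as below. `Notifier`, `withCallLog`, `silentNotifier`, and `notifyAll` are hypothetical names. The idea is that a caller can wrap the record to add logging or swap in a test double without touching the code that uses it.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical interface expressed as a record of functions.
newtype Notifier = Notifier
  { sendNotice :: String -> IO ()
  }

-- | Wrap any Notifier so that every message is recorded before being
-- forwarded. The wrapped implementation is not modified.
withCallLog :: IORef [String] -> Notifier -> Notifier
withCallLog logRef inner =
  Notifier
    { sendNotice = \msg -> do
        modifyIORef' logRef (msg :) -- newest message first
        sendNotice inner msg
    }

-- | Test double that discards every message.
silentNotifier :: Notifier
silentNotifier = Notifier {sendNotice = \_ -> pure ()}

-- Business code depends only on the record, not on an implementation.
notifyAll :: Notifier -> [String] -> IO ()
notifyAll notifier = mapM_ (sendNotice notifier)
```

A caller could create a log with `newIORef []`, pass `withCallLog ref silentNotifier` to `notifyAll`, and then inspect the log with `readIORef ref`. The same wrapping pattern extends to tracing or metrics.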