Keep Deterministic Work Deterministic

8 hours ago
  • #LLM pipelines
  • #deterministic programming
  • #AI reliability
  • The article discusses the challenges of reliability in LLM-based systems, using a blackjack simulation as an example.
  • Early runs of the simulation passed only 37% of the time, as miscounts and rule violations compounded from one step to the next.
  • The 'March of Nines' concept illustrates how the effort required grows with each step up in reliability, from 90% to 99% and beyond.
  • An exercise demonstrates cascading failures in LLMs, showing how small errors in early steps can lead to significant deviations in the final result.
  • The article stresses keeping deterministic work deterministic: use code rather than an LLM for tasks like arithmetic and rule validation.
  • Iterative improvements to the blackjack pipeline (restructuring data, using chain-of-thought prompting, and replacing LLM validators with code) raised the pass rate to 94%.
  • The key takeaway is to identify the deterministic tasks in an LLM pipeline and move them into code to achieve higher reliability.
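
As a rough illustration of the points above, the following Python sketch shows what moving arithmetic and validation out of the model might look like. It is not the article's code. The card encoding, the function names (`hand_total`, `check_reported_total`, `end_to_end_success`), and the assumption that pipeline steps fail independently are all hypothetical choices made for this example.

```python
from math import prod

# Hypothetical card encoding: ranks as strings, suits ignored.
CARD_VALUES = {str(n): n for n in range(2, 11)}
CARD_VALUES.update({"J": 10, "Q": 10, "K": 10, "A": 11})


def hand_total(cards):
    """Return the best blackjack total, counting each ace as 11 or 1."""
    total = sum(CARD_VALUES[c] for c in cards)
    soft_aces = cards.count("A")
    # Downgrade aces from 11 to 1 while the hand would otherwise bust.
    while total > 21 and soft_aces:
        total -= 10
        soft_aces -= 1
    return total


def check_reported_total(cards, reported):
    """Compare a total claimed by an LLM step with the computed total.

    Returns (matches, expected) so the caller can log or correct the value.
    """
    expected = hand_total(cards)
    return reported == expected, expected


def end_to_end_success(step_rates):
    """Chance that every step succeeds, assuming the steps are independent."""
    return prod(step_rates)


if __name__ == "__main__":
    print(hand_total(["A", "K"]))                  # ace counted as 11
    print(hand_total(["A", "A", "9"]))             # one ace downgraded to 1
    print(check_reported_total(["K", "Q", "5"], 24))
    print(round(end_to_end_success([0.9] * 10), 3))
```

Under the independence assumption, ten steps that each succeed 90% of the time succeed together only about 35% of the time, which is one way small early errors can cascade. The validator never asks the model to add: the total is computed in code and the model's claim is only compared against it.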