Nobody likes lag: How to make low-latency dev sandboxes

3 months ago
  • #performance
  • #cloud-dev
  • #latency
  • Developers expect instant response when typing in terminals or editors.
  • The initial architecture had slow startup (10-30 seconds) and high latency (>200ms).
  • Security was acceptable because agents were not exposed to the public internet.
  • Implemented a warm pool of pre-started machines to cut startup time from as much as 30s to about 50ms.
  • Removed middlemen by connecting clients directly to machines, improving latency.
  • Used JWTs for authorization and moved billing and persistence responsibilities to the LLM router.
  • Adopted Fly.io's fly-replay response header for efficient request routing.
  • Expanded to multi-regional pools to further reduce latency (14ms average).
  • Final architecture is simpler and more performant by removing unnecessary infrastructure.
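
The warm-pool idea can be illustrated with a minimal Python sketch. It is not the article's implementation: `WarmPool`, `Sandbox`, and `fake_boot` are hypothetical names, and the refill step is synchronous here for clarity, whereas a real pool would presumably refill in the background so that callers never wait on a cold boot.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count
from typing import Callable, Deque


@dataclass
class Sandbox:
    sandbox_id: int
    region: str


class WarmPool:
    """Keeps `target_size` sandboxes booted ahead of demand (hypothetical sketch)."""

    def __init__(self, region: str, target_size: int,
                 boot: Callable[[str], Sandbox]) -> None:
        self.region = region
        self.target_size = target_size
        self._boot = boot
        self._idle: Deque[Sandbox] = deque()
        self.refill()

    def refill(self) -> None:
        # Slow path: pay the boot cost before any user is waiting.
        while len(self._idle) < self.target_size:
            self._idle.append(self._boot(self.region))

    def acquire(self) -> Sandbox:
        # Fast path: hand out a sandbox that is already running.
        if self._idle:
            return self._idle.popleft()
        # Pool exhausted: fall back to a cold boot.
        return self._boot(self.region)

    def idle_count(self) -> int:
        return len(self._idle)


_ids = count(1)


def fake_boot(region: str) -> Sandbox:
    """Stand-in for the slow machine start described in the summary."""
    return Sandbox(sandbox_id=next(_ids), region=region)
```

Running one such pool per region is one way to read the multi-regional step: each user is served from the pool nearest to them, and `acquire` only falls back to a cold boot when that pool is empty.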
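
The direct-connection and routing steps can be sketched in the same spirit. The Python below assumes an HS256-signed JWT carrying a hypothetical `machine_id` claim and an `exp` expiry; each machine verifies the token locally, so no central service sits on the request path, and a machine that receives a request meant for another one asks the Fly.io proxy to replay it by returning a `fly-replay` header. The claim name, the `route` function, and the `instance=` target are assumptions for illustration; the summary does not say which replay fields the authors used.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Dict, Optional, Tuple


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Issue a compact HS256 JWT using only the standard library."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)


def verify_jwt(token: str, secret: bytes,
               now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        expected = hmac.new(secret, f"{header_b64}.{claims_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(claims_b64))
    except ValueError:
        return None  # malformed token, bad base64, or bad JSON
    if header.get("alg") != "HS256":
        return None
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        return None
    return claims


def route(token: str, secret: bytes, this_machine: str,
          now: Optional[float] = None) -> Tuple[str, Dict[str, str]]:
    """Decide locally whether to serve, replay elsewhere, or reject."""
    claims = verify_jwt(token, secret, now)
    if claims is None:
        return "reject", {}
    target = claims["machine_id"]  # hypothetical claim name
    if target == this_machine:
        return "serve", {}
    # Ask the Fly.io proxy to replay the request on the intended machine.
    return "replay", {"fly-replay": f"instance={target}"}
```

Because verification needs only the shared secret, the machine can authorize a client without calling back to a central API, which is the property that makes removing the middlemen workable.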