Nobody likes lag: How to make low-latency dev sandboxes
3 months ago
- #performance
- #cloud-dev
- #latency
- Developers expect instant response when typing in terminals or editors.
- Initial architecture had poor startup time (10-30 seconds) and high latency (>200ms).
- Security was acceptable because agents were not exposed to the public internet.
- Implemented a warm pool to reduce startup time from 30s to 50ms.
- Removed middlemen by connecting clients directly to machines, improving latency.
- Used JWT for authorization and moved billing/persistence responsibilities to the LLM router.
- Adopted Fly.io's fly-replay response header for efficient request routing.
- Expanded to multi-regional pools to further reduce latency (14ms average).
- Final architecture is simpler and more performant by removing unnecessary infrastructure.
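
The warm pool idea above can be sketched in a few lines. This is a minimal illustration, not the post's actual implementation: `WarmPool`, `fake_boot`, and the pool size are hypothetical names and values chosen for the example. The point it shows is that the slow boot happens ahead of time, so claiming a sandbox is only a queue pop.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count
from typing import Callable, Deque


@dataclass
class WarmPool:
    """Keeps up to `target_size` pre-booted sandboxes ready to hand out."""

    boot: Callable[[], str]  # hypothetical: slow call that boots a machine and returns its id
    target_size: int = 3
    _idle: Deque[str] = field(default_factory=deque, init=False)

    def refill(self) -> None:
        """Boot machines until the pool is back at its target size.

        In a real system this would run in the background, off the request path.
        """
        while len(self._idle) < self.target_size:
            self._idle.append(self.boot())

    def claim(self) -> str:
        """Hand out a warm machine if one exists, else fall back to a cold boot."""
        if self._idle:
            return self._idle.popleft()
        return self.boot()

    def idle_count(self) -> int:
        return len(self._idle)


_ids = count(1)


def fake_boot() -> str:
    # Stand-in for the slow (10-30 second) boot path described above.
    return f"machine-{next(_ids)}"


pool = WarmPool(boot=fake_boot, target_size=2)
pool.refill()           # pay the boot cost before any user shows up
machine = pool.claim()  # fast path: no boot on the request path
pool.refill()           # top the pool back up afterwards
```

The cold-boot fallback in `claim` is a design choice for the sketch: an empty pool degrades to the old startup time instead of failing the request.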
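
The JWT and fly-replay points can be combined into one small sketch of a router that checks a token and then asks Fly.io's proxy to replay the request on a specific machine. Several things here are assumptions made for illustration: the shared `SECRET`, the `machine_id` claim, the `route` function, and the placeholder status code are not from the original post. The `fly-replay: instance=<id>` header format is Fly.io's documented way to target a machine. The hand-rolled HS256 code only keeps the example dependency-free; a maintained JWT library is the better choice in practice.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Dict, Optional, Tuple

SECRET = b"demo-secret"  # assumption: shared HS256 key, for illustration only


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(header: str, payload: str) -> bytes:
    return hmac.new(SECRET, f"{header}.{payload}".encode(), hashlib.sha256).digest()


def sign(claims: dict) -> str:
    """Issue an HS256 token carrying the given claims."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_b64url(_signature(header, payload))}"


def verify(token: str) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None.

    The token's own "alg" header is ignored on purpose: only HS256 with
    SECRET is ever accepted.
    """
    try:
        header, payload, sig = token.split(".")
        if not hmac.compare_digest(_signature(header, payload), _b64url_decode(sig)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return None
    if not isinstance(claims, dict) or claims.get("exp", 0) <= time.time():
        return None
    return claims


def route(token: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) telling the proxy where to replay the request."""
    claims = verify(token)
    if claims is None or "machine_id" not in claims:
        return 401, {}
    # The status code is a placeholder in this sketch; the header does the routing.
    return 200, {"fly-replay": f"instance={claims['machine_id']}"}


token = sign({"machine_id": "abc123", "exp": time.time() + 60})
status, headers = route(token)
```

Because the token itself names the target machine in this sketch, the router needs no database lookup on the hot path, which fits the post's theme of removing middlemen.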