How we run Firecracker VMs inside EC2 and start browsers in <1s
6 hours ago
- #performance optimization
- #virtualization
- #cloud browsers
- The cost of a cloud browser session fell from $0.06 to $0.02 per hour, and sessions also became faster.
- Traditional virtual machines (VMs) were too slow and expensive for cloud browsers.
- Firecracker, a lightweight VM system, was adopted to run each browser session in its own VM.
- Firecracker is typically run on bare-metal servers, but the company runs it as nested VMs on regular EC2 instances for lower cost and better scalability.
- The previous unikernel-based setup lacked autoscaling, so traffic spikes required manual intervention.
- A custom control plane was built to monitor the browser fleet and scale VMs dynamically.
- Nested virtualization adds latency, which was reduced by using larger memory pages and custom page-fault handling.
- VM resume time dropped from 9.8 seconds to 3.1 seconds, and the number of page faults fell by a factor of 91.
- The Chromium startup bottleneck was addressed by leaving CPUs unpinned during launch and pinning them once the browser is ready.
- Browser stealth was improved with a Chromium fork and fingerprinting changes, achieving 81-84.8% block avoidance.
- Cold start takes under 400 ms; browser-create latency is 825 ms at p50 and 1.35 s at p99.
- Next goal: snapshot the VM after Chromium has started, so that browser startup can be skipped entirely.
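
The "unpinned during launch, pinned once ready" idea can be illustrated with a minimal Linux-only sketch. It is not the company's implementation: the summary does not say whether the pinning is applied to Chromium itself, to Firecracker vCPU threads, or through cgroups. The helper names `pin_all_threads` and `launch_then_pin`, and the `is_ready` callback, are hypothetical; only `os.sched_setaffinity` and `subprocess.Popen` are real standard-library calls.

```python
import os
import subprocess
import time


def pin_all_threads(pid, cpus):
    """Pin every thread of process `pid` to the CPU set `cpus` (Linux only)."""
    # sched_setaffinity acts on a single thread, so walk /proc/<pid>/task.
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            os.sched_setaffinity(int(tid), cpus)
        except ProcessLookupError:
            pass  # the thread exited between listing and pinning


def launch_then_pin(cmd, pinned_cpus, is_ready, timeout_s=10.0, poll_s=0.01):
    """Start `cmd` without a CPU restriction, then pin it once it is ready.

    `is_ready` is a caller-supplied (hypothetical) readiness check that
    receives the Popen object, e.g. one that probes a debugging port.
    """
    # Launch phase: the child inherits the parent's affinity mask, so a
    # bursty, multi-threaded startup can spread over every allowed core.
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + timeout_s
    while not is_ready(proc):
        if proc.poll() is not None:
            raise RuntimeError("process exited before becoming ready")
        if time.monotonic() > deadline:
            proc.kill()
            raise TimeoutError("process did not become ready in time")
        time.sleep(poll_s)
    # Steady state: restrict the process to its dedicated cores.
    pin_all_threads(proc.pid, pinned_cpus)
    return proc
```

Threads created after pinning inherit the affinity of the thread that creates them, so in this sketch the restriction carries over to later work as well.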
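
The summary does not describe how the control plane decides when to scale, so the following is only a sketch of one common policy: keep enough hosts to serve the current sessions plus a fixed percentage of headroom. The function name `desired_host_count` and its parameters `sessions_per_host`, `headroom_pct`, and `min_hosts` are assumptions made for illustration.

```python
def desired_host_count(active_sessions, sessions_per_host,
                       headroom_pct=20, min_hosts=1):
    """Return how many hosts are needed for the current load plus headroom.

    Integer arithmetic is used throughout so that rounding is exact.
    """
    if sessions_per_host <= 0:
        raise ValueError("sessions_per_host must be positive")
    if active_sessions < 0:
        raise ValueError("active_sessions must be non-negative")
    # Sessions to provision for: current load plus headroom, rounded up.
    target_sessions = (active_sessions * (100 + headroom_pct) + 99) // 100
    # Hosts needed for that many sessions, rounded up.
    hosts = (target_sessions + sessions_per_host - 1) // sessions_per_host
    # Never scale below the configured floor.
    return max(min_hosts, hosts)
```

With 100 active sessions, 40 sessions per host, and 20% headroom, the sketch provisions for 120 sessions and therefore asks for 3 hosts. A control loop would compare this number with the current fleet size and add or remove hosts accordingly.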