What 100k concurrent sandboxes has taught us so far

3 days ago
  • #benchmarking
  • #serverless
  • #scalability
  • The Scale Invitational's original goal was to test how providers handle spinning up tens of thousands of sandboxes simultaneously.
  • Initial attempts using a single VM for 10,000 sandboxes revealed that the test harness itself became a bottleneck, skewing results.
  • To better simulate real workloads, the architecture was redesigned to use sharding, distributing the load across multiple VMs.
  • About 100 iterations per shard proved a workable balance: few enough to avoid bottlenecking any single VM, yet enough to keep the fleet size manageable.
  • A key insight was the difference between measuring throughput (creating sandboxes quickly) and true concurrency (sustaining many sandboxes alive simultaneously).
  • The test was adjusted to keep sandboxes alive until peak concurrency is reached, and new result categories such as 'partial' now flag sandboxes that died prematurely.
  • Log aggregation was implemented across shards to enable debugging at scale, storing logs in durable object storage.
  • A data pipeline using Tigris for cold storage and ClickHouse for analytics was set up to handle the large volume of results.
  • Due to these complexities and iterative improvements, the event was postponed to June 17th to ensure accurate and meaningful results.
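
The sharding and throughput-versus-concurrency points above can be illustrated with a minimal Python sketch. It is not the Scale Invitational's actual harness. The names `plan_shards`, `SandboxRun`, `peak_concurrency`, and `classify` are hypothetical, as are the "success" and "failed" labels; only the 'partial' category and the figure of about 100 iterations per shard come from the summary.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def plan_shards(total: int, per_shard: int = 100) -> List[int]:
    """Split `total` sandbox creations into shards of at most `per_shard` each."""
    if total < 0 or per_shard <= 0:
        raise ValueError("total must be >= 0 and per_shard must be > 0")
    full, rest = divmod(total, per_shard)
    return [per_shard] * full + ([rest] if rest else [])


@dataclass
class SandboxRun:
    # Seconds since test start; None means the sandbox was never created.
    created_at: Optional[float]
    # Seconds since test start; None means it was still alive at the end.
    died_at: Optional[float]


def peak_concurrency(runs: List[SandboxRun]) -> Tuple[float, int]:
    """Return (time, count) for the largest number of sandboxes alive at once."""
    events = []
    for run in runs:
        if run.created_at is None:
            continue
        events.append((run.created_at, 1))
        if run.died_at is not None:
            events.append((run.died_at, -1))
    # At equal timestamps, process deaths (-1) before creations (+1).
    events.sort()
    alive = peak = 0
    peak_time = 0.0
    for timestamp, delta in events:
        alive += delta
        if alive > peak:
            peak, peak_time = alive, timestamp
    return peak_time, peak


def classify(run: SandboxRun, hold_until: float) -> str:
    """Label a run; 'partial' means it died before the hold deadline."""
    if run.created_at is None:
        return "failed"  # hypothetical label
    if run.died_at is not None and run.died_at < hold_until:
        return "partial"
    return "success"  # hypothetical label


if __name__ == "__main__":
    print(len(plan_shards(10_000)))  # number of shards for 10,000 sandboxes
    runs = [
        SandboxRun(0.0, None),
        SandboxRun(1.0, None),
        SandboxRun(2.0, 2.5),    # died before the last sandbox came up
        SandboxRun(None, None),  # never created
        SandboxRun(3.0, None),
    ]
    hold_until = 3.0  # assumed: the moment the last sandbox was created
    print([classify(r, hold_until) for r in runs])
    print(peak_concurrency(runs))
```

In the sample data, four sandboxes are created successfully, but at most three are alive at the same moment. That gap is the difference between a throughput count and a true concurrency measurement.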