Hasty Briefs (beta)

A Case Study in Rewriting a Critical Service in Rust

9 days ago
  • #Performance Optimization
  • #Rust vs Go
  • #Cost Savings
  • A critical payment service at TikTok, written in Go, became CPU-bound under high traffic, which limited its scalability and drove up operating costs.
  • The solution involved rewriting the most CPU-intensive API endpoints in Rust while keeping the rest of the service in Go, leveraging Rust's performance and memory efficiency.
  • The Rust implementation was tested for correctness in shadow mode, where it showed 100% data consistency with the original Go service.
  • Stress testing showed that the Rust service handled twice the traffic of the Go service at lower latency, with significantly lower CPU and memory usage.
  • The performance improvements led to a projected annual cost saving of nearly $300,000 by cutting the required compute by more than 400 vCPUs.
  • The project highlighted the importance of using the right tool for the job: Go remains ideal for most services, while Rust suits CPU-bound bottlenecks.
  • Key metrics showed 33.6% lower CPU usage, 72% lower memory usage, and 76% lower p99 latency in the Rust service compared with the Go service.
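
The summary does not say how the shadow-mode check was built, so the following is only a minimal sketch of the general idea, not the team's actual harness. It assumes each request is sent to both implementations, that only the existing service's response is returned to the caller, and that the candidate's response is compared and then discarded. The names `Response`, `ShadowReport`, and `shadow_compare` are hypothetical, and the two closures stand in for the Go and Rust endpoints.

```rust
/// Hypothetical response type; a real service would compare its own payloads.
#[derive(Debug, Clone, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

/// Result of replaying a batch of requests against both implementations.
struct ShadowReport {
    total: usize,
    mismatches: Vec<String>, // requests whose responses differed
}

impl ShadowReport {
    /// Percentage of requests for which both implementations agreed.
    fn consistency_pct(&self) -> f64 {
        if self.total == 0 {
            return 100.0;
        }
        let matched = self.total - self.mismatches.len();
        matched as f64 * 100.0 / self.total as f64
    }
}

/// Runs every request through `primary` (the service whose answer is served)
/// and `candidate` (the shadow), recording any request where they disagree.
fn shadow_compare<P, C>(requests: &[String], primary: P, candidate: C) -> ShadowReport
where
    P: Fn(&str) -> Response,
    C: Fn(&str) -> Response,
{
    let mut mismatches = Vec::new();
    for req in requests {
        let served = primary(req); // this is what the caller would receive
        let shadow = candidate(req); // compared only, never returned
        if served != shadow {
            mismatches.push(req.clone());
        }
    }
    ShadowReport {
        total: requests.len(),
        mismatches,
    }
}

fn main() {
    let requests: Vec<String> = vec!["pay:1".into(), "pay:2".into(), "refund:7".into()];

    // Stand-ins for the existing Go endpoint and the Rust rewrite.
    let existing = |r: &str| Response { status: 200, body: r.to_uppercase() };
    let rewrite = |r: &str| Response { status: 200, body: r.to_uppercase() };

    let report = shadow_compare(&requests, existing, rewrite);
    println!(
        "{} requests, {} mismatches, {:.1}% consistent",
        report.total,
        report.mismatches.len(),
        report.consistency_pct()
    );
}
```

In a setup like this, the rewrite would be promoted only once the consistency figure holds at 100% over a representative window of production traffic, which is the criterion the summary reports.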