5x perf increase on writes with FPW disabled in Postgres

2 days ago
  • #performance scaling
  • #database architecture
  • #Postgres optimization
  • Lakebase architecture decouples compute and storage, enabling performance optimizations impossible in monolithic Postgres.
  • Traditional Postgres uses Full Page Writes (FPW) to prevent data corruption from torn pages during crashes, but this inflates WAL volume by up to 15x.
  • In Lakebase, compute is stateless with no local disk, eliminating the torn-page risk FPW addresses.
  • Disabling FPW naively could cause unbounded delta chains in storage, increasing read latency and resource use.
  • Image generation pushdown moves FPW-like image creation to the storage layer, generating full page images based on delta thresholds rather than checkpoints.
  • Benchmarks show throughput gains that scale with compute size: more than 4.5x on 32 vCPU instances, along with a 94% reduction in WAL traffic.
  • Production benefits include reduced WAL generation (e.g., from 30 MB/s to 1 MB/s), improved read latencies (p99 down by 30-50%), and higher ingestion throughput (e.g., 3x increase for one customer).
  • The optimization was rolled out globally without downtime using Postgres's XLOG_FPW_CHANGE mechanism, enhancing scalability and stability.
  • This is part of a broader effort to offload heavy tasks to scalable storage, eliminating Postgres write bottlenecks and improving performance.
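
The delta-threshold idea behind image generation pushdown can be illustrated with a small sketch. This is not Lakebase's actual implementation: the `PageHistory` class, its method names, and the `threshold` value of 4 are all hypothetical, chosen only to show how a storage layer might collapse a growing delta chain into a fresh full page image once the chain reaches a fixed length, instead of relying on checkpoint-driven full page writes from compute.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 8192  # Postgres default block size in bytes


@dataclass
class PageHistory:
    """Hypothetical storage-side state for one page: a base image plus deltas."""
    base: bytes = bytes(PAGE_SIZE)
    deltas: list = field(default_factory=list)  # pending (offset, data) pairs
    images_generated: int = 0

    def apply_wal_delta(self, offset: int, data: bytes, threshold: int = 4) -> None:
        """Record a small WAL delta; materialize a new image at the threshold."""
        assert offset + len(data) <= PAGE_SIZE
        self.deltas.append((offset, data))
        if len(self.deltas) >= threshold:
            # Assumed policy: fold the whole chain into a new base image so
            # later reads never replay more than `threshold - 1` deltas.
            self.base = self.read()
            self.deltas.clear()
            self.images_generated += 1

    def read(self) -> bytes:
        """Reconstruct the current page by replaying deltas over the base image."""
        page = bytearray(self.base)
        for offset, data in self.deltas:
            page[offset:offset + len(data)] = data
        return bytes(page)


# Ten one-byte writes with a threshold of 4: two images are generated
# (after the 4th and 8th writes) and two deltas remain pending.
page = PageHistory()
for i in range(10):
    page.apply_wal_delta(i, bytes([i + 1]))
```

In this sketch the read cost is bounded by the threshold rather than by the time since the last checkpoint, which is the property the summary attributes to pushdown. A real system would likely weigh delta size and read frequency as well as chain length.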