Hasty Briefsbeta

Bilingual

Supertoast tables

12 hours ago
  • #S3 Storage
  • #Postgres
  • #Database Optimization
  • Postgres is the starting point for data storage in web applications due to its general-purpose nature and performance improvements.
  • Hatchet uses Postgres for task queues, but storing large JSON payloads in jsonb columns led to storage inefficiencies and autovacuum issues.
  • TOAST tables in Postgres handle large values but can cause high IOPS load during autovacuum operations.
  • A solution was developed to offload infrequently accessed payloads to S3, keeping hot payloads in Postgres for fast access.
  • The 'supertoast' table approach partitions data by day, offloading older payloads to S3 while maintaining references in Postgres.
  • Initial attempts using a WAL model for offloading were inefficient and costly, leading to a 'write-and-swap' approach.
  • The 'write-and-swap' method processes payloads in batches, compresses them, and stores them in S3 with pointers in Postgres.
  • Partitioning allows for efficient swapping of old and new data partitions without autovacuum pressure.
  • The system now handles hundreds of millions of payloads daily with reduced database CPU usage and S3 costs.