Hasty Briefsbeta

Bilingual

How TimescaleDB compresses time-series data

5 hours ago
  • #compression
  • #timescaledb
  • #postgresql
  • TimescaleDB achieves up to 98% compression for time-series data using hypercore, a hybrid row-columnar engine with specialized algorithms like delta encoding and run-length encoding.
  • Compression in TimescaleDB targets cross-row patterns, unlike PostgreSQL's TOAST which handles individual large values, leading to higher ratios (e.g., 10-100×) for numeric and timestamp columns.
  • Data is organized into batches of ~1000 rows in columnar format after compression, optimizing storage and query performance by reducing I/O and enabling vectorized execution.
  • Key parameters segmentby and orderby determine row grouping and sorting, crucial for efficient compression and query filtering, with segmentby ideally having 100-10,000 unique values per chunk.
  • Compression typically speeds up queries like range scans and aggregations but may slow down point lookups or updates on compressed data.
  • Implementation involves configuring compression policies (e.g., using segmentby='machine_id' and orderby='time DESC') and setting automatic compression for older chunks.