How TimescaleDB compresses time-series data

5 hours ago

TimescaleDB achieves up to 98% compression for time-series data using hypercore, a hybrid row-columnar engine with specialized algorithms like delta encoding and run-length encoding.
Compression in TimescaleDB targets cross-row patterns, unlike PostgreSQL's TOAST which handles individual large values, leading to higher ratios (e.g., 10-100×) for numeric and timestamp columns.
Data is organized into batches of ~1000 rows in columnar format after compression, optimizing storage and query performance by reducing I/O and enabling vectorized execution.
Key parameters segmentby and orderby determine row grouping and sorting, crucial for efficient compression and query filtering, with segmentby ideally having 100-10,000 unique values per chunk.
Compression typically speeds up queries like range scans and aggregations but may slow down point lookups or updates on compressed data.
Implementation involves configuring compression policies (e.g., using segmentby='machine_id' and orderby='time DESC') and setting automatic compression for older chunks.

Hasty Briefsbeta