How we made WINDOW JOIN parallel and vectorized
7 days ago
- #Performance Optimization
- #QuestDB
- #WINDOW JOIN
- QuestDB introduced a dedicated WINDOW JOIN operator to efficiently aggregate one table over a time window around each row of another, simplifying complex SQL queries that previously required multiple joins and unions.
- The operator achieves high performance through data-level parallelism, where the left-hand side (LHS) table is split into page frames processed by worker threads, and a low-cardinality fast path that enables SIMD vectorized aggregation.
- Key optimizations include building per-key indexes and contiguous value buffers for RHS data, allowing binary searches for window bounds and efficient SIMD kernels for aggregates like sum, avg, min, and max.
- Benchmarks against Timescale, DuckDB, and ClickHouse on a 50M-row trades and 150M-row prices dataset showed QuestDB's parallel + SIMD implementation running 5x faster than its single-threaded version and 25x faster than ClickHouse's best rewrite.
- The WINDOW JOIN operator supports various configurations, including one-sided windows, include/exclude prevailing rows, and multiple aggregate functions, with plans to expand temporal operators like HORIZON JOIN in future releases.