Hasty Briefsbeta

Bilingual

How we made WINDOW JOIN parallel and vectorized

4 days ago
  • #Database Performance
  • #QuestDB
  • #Window Join Optimization
  • QuestDB introduced a specialized WINDOW JOIN operator to efficiently aggregate one table over a time window around each row of another, improving performance for temporal queries like attaching average bid and ask prices to trades within a 1-second window.
  • The operator leverages data-level parallelism by slicing the left-hand side (LHS) table into page frames processed by worker threads, each needing a slice of the right-hand side (RHS) table determined by binary search due to timestamp-ordered storage.
  • A fast path for low-cardinality equality joins (e.g., symbol matching) copies RHS values into contiguous per-key buffers, enabling SIMD vectorized aggregation kernels (e.g., sum, avg, min, max) that process eight doubles per iteration on AVX2, significantly speeding up the inner loop.
  • Benchmarks against Timescale, DuckDB, and ClickHouse on a 50M-row trades and 150M-row prices dataset showed QuestDB's parallel + SIMD implementation running 5.0x faster than its single-threaded version and 25x faster than ClickHouse's best rewrite, with DuckDB and Timescale failing to complete within time or disk limits.
  • The operator supports various use cases including one-sided windows, INCLUDE/EXCLUDE PREVAILING options, and multiple aggregates, and is part of a broader effort to make temporal operators first-class in QuestDB, with HORIZON JOIN already available.