Query Engines: Push vs. Pull (2021)

a year ago
  • #performance
  • #query-engines
  • #database-systems
  • Push-based query engines push results to downstream operators, improving cache efficiency and enabling efficient processing of DAG-shaped plans.
  • Pull-based query engines follow the Volcano (iterator) model, in which the consumer drives execution by requesting rows from upstream operators one at a time.
  • Push-based systems decouple work from consumption, making them suitable for streaming systems like Flink or Materialize.
  • DAG-shaped plans are more efficiently handled in push-based systems due to better scheduling and lifetime management of rows.
  • Push-based systems naturally unroll into simpler code when compiled, which can improve performance.
  • Some operators, such as merge join and LIMIT, are harder to implement in push-based systems, because the consumer cannot easily choose which input advances next or tell producers to stop early.
  • Cyclic graphs are nontrivial in both models, but push systems like Naiad and Timely Dataflow have made progress in this area.
  • Modern analytic systems are increasingly exploring push models, though direct comparisons with pull models are rare.
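
To make the contrast concrete, here is a minimal sketch of both styles in Python. It is not taken from the article or from any real engine: the operator names (`pull_scan`, `push_filter`, `push_fanout`, and so on) and the dictionary row format are assumptions made for illustration. In the pull half, each operator is a generator that the consumer drives; in the push half, each operator is a callback that the producer drives.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]          # hypothetical row format for this sketch
Sink = Callable[[Row], None]  # a push operator is just "something that accepts a row"

# ---- Pull (Volcano/iterator) style: the consumer asks for rows ----

def pull_scan(rows: Iterable[Row]) -> Iterator[Row]:
    for row in rows:
        yield row

def pull_filter(child: Iterator[Row], pred: Callable[[Row], bool]) -> Iterator[Row]:
    for row in child:
        if pred(row):
            yield row

def pull_limit(child: Iterator[Row], n: int) -> Iterator[Row]:
    if n <= 0:
        return
    for i, row in enumerate(child, start=1):
        yield row
        if i >= n:
            return  # stop pulling; upstream does no further work

# ---- Push style: the producer hands rows to downstream callbacks ----

def push_scan(rows: Iterable[Row], emit: Sink) -> None:
    for row in rows:
        emit(row)

def push_filter(pred: Callable[[Row], bool], emit: Sink) -> Sink:
    def on_row(row: Row) -> None:
        if pred(row):
            emit(row)
    return on_row

def push_fanout(*emits: Sink) -> Sink:
    # One operator feeding several consumers: the DAG-shaped case.
    def on_row(row: Row) -> None:
        for emit in emits:
            emit(row)
    return on_row

def push_limit(n: int, emit: Sink) -> Sink:
    # Naive LIMIT: it can only drop extra rows, it cannot stop the source.
    seen = 0
    def on_row(row: Row) -> None:
        nonlocal seen
        if seen < n:
            seen += 1
            emit(row)
    return on_row

data: List[Row] = [{"x": i} for i in range(10)]
is_even = lambda r: r["x"] % 2 == 0

# Pull: LIMIT ends the whole pipeline after two matching rows.
pulled = list(pull_limit(pull_filter(pull_scan(data), is_even), 2))

# Push: a single scan feeds two consumers without buffering or rescanning.
evens: List[Row] = []
all_rows: List[Row] = []
push_scan(data, push_fanout(push_filter(is_even, evens.append), all_rows.append))

# Push LIMIT: only two rows get through, yet the scan still pushes all ten.
limited: List[Row] = []
seen_by_limit: List[Row] = []
push_scan(data, push_fanout(seen_by_limit.append, push_limit(2, limited.append)))
```

In this sketch the fan-out is a single extra callback on the push side, whereas a pull-based plan would need to buffer or rescan so that two consumers could read the same operator. The naive `push_limit` shows the opposite trade-off: without some extra stop signal (not modeled here), the source keeps producing rows that are then discarded.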