Hasty Briefsbeta

Bilingual

The strange case of the underestimated Merge Join node

4 months ago
  • #Query Optimization
  • #PostgreSQL
  • #Performance Tuning
  • A customer reported a query that was slow on first execution after a batch process but fast on subsequent runs, with different execution plans.
  • Initial suspicion was autoanalyze not running post-batch, but tables weren't vacuumed or analyzed between executions.
  • The query involved a LEFT JOIN between two tables with non-overlapping histograms for the join columns.
  • First execution used a Merge Join with high cost due to planner's estimation based on incomplete histogram data.
  • Second execution used a Nested Loop Join, benefiting from cleaned index tuples providing accurate extreme values.
  • The planner's get_actual_variable_endpoint() function aborts early on first execution, leading to inaccurate estimates.
  • A script was provided to reproduce the scenario, demonstrating the issue with dead heap tuples affecting index scans.
  • The case highlights a rare scenario where query plans change without data or statistics updates, due to index tuple cleanup.