Hasty Briefsbeta

Bilingual

The real cost of random I/O

2 days ago
  • #Query Optimization
  • #PostgreSQL
  • #Database Performance
  • The default `random_page_cost` in PostgreSQL is set to 4.0, a value introduced ~25 years ago, which may not reflect modern storage realities.
  • Experiments show that the actual cost of random I/O on SSDs is significantly higher (~25-35) than the default, contradicting recommendations to lower it.
  • The discrepancy between cost and actual duration can lead to suboptimal query plans, especially in selectivity ranges where sequential and index scans intersect.
  • Bitmap scans mitigate some of the inefficiencies by making access more sequential and supporting I/O prefetching, which is not accounted for in the cost model.
  • Lowering `random_page_cost` might be justified in scenarios with high cache ratios or to avoid performance cliffs, despite the higher actual cost of random I/O.
  • Future improvements could include separating non-I/O costs from `random_page_cost`, better estimating cached data statistics, and incorporating prefetching into costing.