Locality, and Temporal-Spatial Hypothesis

a day ago

Andres Freund discussed async IO in Postgres 18 at PGConf NYC, highlighting the performance gap between forward and reverse scans that results from read-ahead optimizations.
The temporal-spatial locality hypothesis suggests data written at similar times will be read at similar times and should be stored near each other.
Streaming and time-series systems optimize for the temporal-spatial hypothesis, while hash-partitioned databases such as DynamoDB reject it.
Relational schemas that use UUIDs or large surrogate keys also reject the hypothesis, which reduces spatial locality and can hurt read performance.
Schemas with time-ordered primary keys (e.g., SERIAL, AUTO_INCREMENT) may increase write contention without delivering read benefits, which has led databases to add optimizations for this pattern.
The hypothesis is weakly true for OLTP workloads, where recently written keys tend to be hotter, but taking advantage of it requires careful schema design.
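
To make the locality argument concrete, here is a minimal sketch in Python. It is a toy model, not a measurement of any real database: it assumes rows are stored in key order (a clustered layout), that a fixed number of rows fit on a page, and that "recent" means the last rows inserted. The names `ROWS_PER_PAGE`, `pages_touched`, `sequential_keys`, and `random_keys` are hypothetical and exist only for this illustration.

```python
import random

ROWS_PER_PAGE = 100  # assumed page capacity, chosen only for illustration


def pages_touched(keys_in_insert_order, recent):
    """Count the distinct pages that hold the `recent` newest rows.

    Assumes a key-ordered layout: a row's page is its rank among all
    keys divided by ROWS_PER_PAGE.
    """
    rank = {key: i for i, key in enumerate(sorted(keys_in_insert_order))}
    newest = keys_in_insert_order[-recent:]
    return len({rank[key] // ROWS_PER_PAGE for key in newest})


def sequential_keys(n):
    """Time-ordered keys, in the spirit of SERIAL or AUTO_INCREMENT."""
    return list(range(n))


def random_keys(n, seed=0):
    """Keys with no relation to insert time, standing in for random UUIDs."""
    rng = random.Random(seed)
    return rng.sample(range(10**12), n)


n, recent = 100_000, 1_000
seq_pages = pages_touched(sequential_keys(n), recent)
rnd_pages = pages_touched(random_keys(n), recent)
print("sequential keys:", seq_pages, "pages for the newest", recent, "rows")
print("random keys:    ", rnd_pages, "pages for the newest", recent, "rows")
```

Under these assumptions, the newest 1,000 rows with sequential keys sit on 10 adjacent pages, while with random keys they are scattered across several hundred pages. The same count can be read from the write side: with time-ordered keys, recent inserts concentrate on a handful of pages at the end of the key space, which is one way to picture the write contention mentioned above.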
