Locality, and Temporal-Spatial Hypothesis
a day ago
- #temporal-spatial locality
- #Postgres
- #database performance
- Andres Freund discussed asynchronous I/O in Postgres 18 at PGConf NYC, highlighting the performance gap between forward and reverse scans caused by read-ahead optimizations.
- The temporal-spatial locality hypothesis suggests data written at similar times will be read at similar times and should be stored near each other.
- Streaming and time-series systems optimize based on the temporal-spatial hypothesis, while hash-based databases like DynamoDB reject it.
- Relational schemas using UUIDs or large surrogate keys also reject the hypothesis, which reduces spatial locality and can hurt read performance.
- Schemas with time-ordered primary keys (e.g., SERIAL, AUTO_INCREMENT) may pay for write contention without getting the read benefits, which has prompted database-level optimizations to compensate.
- The hypothesis is weakly true for OLTP workloads, where recent keys tend to be hotter, but exploiting it for optimal performance requires careful schema design.
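
The trade-off in the points above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of any real storage engine: it assumes rows are stored sorted by primary key and packed into fixed-size pages, and the names `ROWS_PER_PAGE`, `pages_touched`, and `compare` are hypothetical. Plain integers stand in for SERIAL / AUTO_INCREMENT keys, and random 128-bit integers stand in for random UUIDs.

```python
import random

ROWS_PER_PAGE = 100  # hypothetical page capacity, chosen for illustration


def pages_touched(keys_in_insert_order, recent):
    """Count the distinct pages that hold the `recent` newest rows.

    Assumes rows are stored sorted by key and packed ROWS_PER_PAGE to a page.
    """
    # Map each key to its slot in key-sorted storage order.
    slot = {key: i for i, key in enumerate(sorted(keys_in_insert_order))}
    newest = keys_in_insert_order[-recent:]
    return len({slot[key] // ROWS_PER_PAGE for key in newest})


def compare(total_rows=10_000, recent=100, seed=0):
    """Return (pages for sequential keys, pages for random keys)."""
    rng = random.Random(seed)
    # Stand-in for SERIAL / AUTO_INCREMENT: keys grow with insertion time.
    sequential = list(range(total_rows))
    # Stand-in for random UUIDs: key order is unrelated to insertion time.
    randomized = [rng.getrandbits(128) for _ in range(total_rows)]
    return pages_touched(sequential, recent), pages_touched(randomized, recent)


if __name__ == "__main__":
    seq_pages, rand_pages = compare()
    print(f"sequential keys: {seq_pages} page(s); random keys: {rand_pages} page(s)")
```

With these defaults, the 100 newest sequential keys all sit on a single page, while the newest random keys are scattered across many pages. The same count can be read in two ways: as the number of pages a read of recent rows must fetch, and as the number of pages absorbing the newest writes. A count of one is therefore good for read locality and, in this simplified model, also a potential write hot spot.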