What Are Skiplists Good For?
2 days ago
- #database-optimization
- #data-structures
- #skiplists
- Skiplists are randomized data structures that serve as alternatives to binary search trees, offering similar interfaces and asymptotic complexities with O(log n) search times.
- They can be visualized as linked lists with 'express lanes', where higher-level lists skip over nodes to accelerate searches.
- Skiplists have practical applications in concurrency, enabling relatively simple lock-free implementations.
- At Antithesis, a generalization called a 'skiptree' solved a challenge involving tree-shaped queries in an analytic database like Google BigQuery.
- Skiptrees store hierarchical data across multiple SQL tables with columns for next-level ancestors and intermediate nodes, allowing ancestor lookups with fixed JOINs instead of recursive point lookups.
- This approach optimized query costs in BigQuery, which charged based on data scanned, by reducing scans to roughly twice a normal table scan.
- Skiptrees are related to skip graphs, a distributed data structure, highlighting the utility of exotic data structures in unexpected scenarios.
- A key advantage of skiplists is that even naive implementations offer adequate performance, making them practical for environments like SQL.