How does B-tree make your queries fast?

2 months ago

B-tree is a data structure that enables efficient searching through large amounts of data and is widely used in modern databases.
B-tree differs from Binary Search Tree (BST) by allowing multiple values per node and being self-balancing, which optimizes performance.
Hardware considerations like CPU caches, RAM, and disk storage influence the design of B-tree to minimize random access and maximize sequential access.
Sequential access is significantly faster than random access, especially on HDDs, making B-tree's design advantageous for disk-based storage.
B-tree reduces tree height by packing multiple values into nodes (pages), which decreases the number of disk reads required for queries.
Postgres uses 8kB pages, allowing ~375 values per page, enabling efficient storage and retrieval even for large datasets (e.g., 1 billion rows in 4 levels).
B-tree is self-balancing, ensuring consistent performance during insertions by splitting nodes and promoting values to parent nodes as needed.
Postgres optimizes splits to minimize free space, enhancing storage efficiency.
B-tree's design, focusing on hardware efficiency and self-balancing, has made it a durable and effective indexing structure for decades.

Hasty Briefsbeta