Introduction to PostgreSQL Indexes
3 months ago
- #postgresql
- #database
- #performance
- PostgreSQL indexes are designed to speed up data access by reducing disk reads.
- Indexes back constraints such as primary keys and unique keys, but they only speed up queries whose conditions match the indexed columns and data types.
- As a rule of thumb, indexes pay off when a query returns less than roughly 15-20% of a table's rows; beyond that, the planner may prefer a sequential scan.
- PostgreSQL stores table data in heap files divided into pages (8 kB by default), with rows (tuples) stored in no particular order.
- Indexes link key values to row locators (ctid) in the heap, enabling faster data retrieval.
- Creating an index on a column can drastically reduce query execution time by avoiding full table scans.
- Indexes come with costs: increased disk space, overhead on write operations (INSERT, UPDATE, DELETE), and memory usage.
- PostgreSQL offers six default index types (B-tree, Hash, BRIN, GIN, GiST, SP-GiST) with more available via extensions.
- B-tree is the most common index type, supporting primary/unique keys and efficient O(log n) searches.
- Multi-column indexes can serve queries that filter on several columns, but column order matters: the index is most useful when the query constrains its leading columns.
- Partial indexes reduce size and overhead by indexing only a subset of rows based on a condition.
- Covering indexes enable index-only scans by including all columns needed for a query.
- Expression indexes allow indexing transformed data (e.g., lowercased text) for efficient querying.
- Hash indexes are compact and fast for equality checks but lack support for ordering or multi-column use.
- BRIN indexes are space-efficient for large, append-only datasets whose column values correlate with the physical row order (e.g., timestamps).
- GIN indexes excel at searching composite data (e.g., arrays, JSONB, full-text documents).
- GiST and SP-GiST are frameworks for custom index types, useful for geometric and full-text search.
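
To make the heap, ctid, and index points above more concrete, here is a minimal Python sketch of the idea. It is a toy model, not how PostgreSQL is implemented: `PAGE_SIZE`, `Ctid`, `build_heap`, `build_index`, `seq_scan`, and `index_scan` are hypothetical names invented for illustration, the "index" is a sorted list searched with binary search rather than a real B-tree, and pages hold 4 rows instead of 8 kB of data. The `key` argument stands in for an expression index and the `predicate` argument for a partial index.

```python
from bisect import bisect_left
from typing import NamedTuple

PAGE_SIZE = 4  # rows per page (toy value; real pages are sized in bytes)


class Ctid(NamedTuple):
    """Row locator: which page, and which slot inside that page."""
    page: int
    slot: int


def build_heap(rows):
    """Store rows in fixed-size pages, in whatever order they arrive."""
    return [rows[i:i + PAGE_SIZE] for i in range(0, len(rows), PAGE_SIZE)]


def build_index(heap, key, predicate=None):
    """Return a sorted list of (key value, ctid) entries.

    `key` may compute an expression (like an expression index);
    `predicate` limits which rows are indexed (like a partial index).
    """
    entries = []
    for page_no, page in enumerate(heap):
        for slot, row in enumerate(page):
            if predicate is None or predicate(row):
                entries.append((key(row), Ctid(page_no, slot)))
    entries.sort()
    return entries


def seq_scan(heap, key, value):
    """Read every page; return matching rows and the number of pages read."""
    matches = [row for page in heap for row in page if key(row) == value]
    return matches, len(heap)


def index_scan(heap, index, value):
    """Binary-search the index, then fetch only the pages the ctids point to."""
    # (value,) sorts just before every (value, ctid) entry, so bisect_left
    # lands on the first entry whose key equals `value` (if any).
    i = bisect_left(index, (value,))
    matches, pages_read = [], set()
    while i < len(index) and index[i][0] == value:
        ctid = index[i][1]
        matches.append(heap[ctid.page][ctid.slot])
        pages_read.add(ctid.page)
        i += 1
    return matches, len(pages_read)


# 20 rows inserted in a scrambled order, so the heap is not sorted by id.
rows = [
    {"id": n, "email": f"User{n}@Example.com",
     "status": "pending" if n % 5 == 0 else "done"}
    for n in ((i * 7) % 20 for i in range(20))
]
heap = build_heap(rows)


def lower_email(row):
    return row["email"].lower()


# "Expression index" on the lowercased email.
by_email = build_index(heap, key=lower_email)
# "Partial index" on id, covering only pending rows.
pending = build_index(heap, key=lambda r: r["id"],
                      predicate=lambda r: r["status"] == "pending")

hit, idx_pages = index_scan(heap, by_email, "user8@example.com")
_, seq_pages = seq_scan(heap, lower_email, "user8@example.com")
print(f"index scan read {idx_pages} page(s), sequential scan read {seq_pages}")
print(f"full index: {len(by_email)} entries, partial index: {len(pending)} entries")
```

The same trade-offs from the list show up even in this toy: the index must be rebuilt or updated whenever the heap changes, it takes extra memory, and it only helps when the lookup uses the same key expression it was built with.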