Hasty Briefsbeta

Bilingual

Reading the internals of Postgres: Database cluster, databases, and tables

3 days ago
  • #PostgreSQL
  • #Storage
  • #Database Internals
  • PostgreSQL uses a 'database cluster' to refer to a single instance managing multiple databases, not multiple servers.
  • Databases and other objects are identified by OIDs (unsigned integers), with user-created objects starting from OID 16384.
  • System catalogs like pg_class and pg_database store metadata and are regular tables that can be queried.
  • The physical structure centers on the data directory ($PGDATA), containing subdirectories for each database (named by OID) and tables.
  • Tables and indexes are stored in files under base/{database_oid}/{relfilenode}, with forks for free space and visibility maps.
  • VACUUM FULL can change a table's relfilenode, decoupling it from its OID.
  • Tablespaces allow storing database objects in custom directories, with symlinks in $PGDATA/pg_tblspc/.
  • Heap tables consist of 8KB pages containing headers, line pointers, and tuples stored from the page end.
  • TOAST handles large values by storing them externally in a separate table, chunked and indexed.
  • Writing tuples involves updating page headers and line pointers, with alignment considerations.
  • Reading tuples can use sequential scans or index scans (e.g., B-tree), depending on the query.