Is OSM Partially Down?
10 months ago
- #OpenStreetMap
- #PostgreSQL
- #Database Replication
- The OpenStreetMap planet database's replication pipeline halted on June 26, 2025, after a record exceeded PostgreSQL's 1 GB per-field size limit.
- Logical replication failed while processing the oversized record, so every call to pg_logical_slot_peek_changes failed in turn.
- A PostgreSQL patch deployed on May 30 fixed a bug but introduced an exponential feedback loop in invalidation messages, leading to memory issues.
- PostgreSQL developers addressed the issue in a follow-up commit that caps invalidation messages at 8 MB per transaction; the fix was scheduled for release on August 14, 2025.
- The OSM operations team put the site into read-only mode, created a fresh database dump, and generated 'fake logs' to rebuild missing diffs.
- Edited logs were synced to S3, missing minutely diffs were published, and replication was re-enabled after confirming no errors.
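
The recovery step of publishing the missing minutely diffs can be sketched roughly as follows. This is only an illustration, not the operations team's actual tooling: the function names `diff_path` and `missing_sequences` are hypothetical, the sequence numbers are made up, and the code assumes the usual layout in which a nine-digit, zero-padded sequence number is split into three directory levels.

```python
def diff_path(seq: int, ext: str = "osc.gz") -> str:
    """Map a replication sequence number to its relative file path.

    Assumes the nine-digit AAA/BBB/CCC layout; `ext` is the file suffix.
    """
    if not 0 <= seq <= 999_999_999:
        raise ValueError(f"sequence number out of range: {seq}")
    padded = f"{seq:09d}"  # zero-pad to nine digits
    return f"{padded[0:3]}/{padded[3:6]}/{padded[6:9]}.{ext}"


def missing_sequences(published: set[int], first: int, last: int) -> list[int]:
    """Return the sequence numbers in [first, last] that were never published."""
    return [n for n in range(first, last + 1) if n not in published]


if __name__ == "__main__":
    # Hypothetical example: diffs 10, 11 and 14 exist, 12 and 13 are missing.
    published = {10, 11, 14}
    for seq in missing_sequences(published, 10, 14):
        print(seq, diff_path(seq))
```

Listing the gaps first makes it easy to confirm, after the rebuilt diffs are synced, that every sequence number in the affected range now has a file before replication is re-enabled.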