Railway GCP Account Suspension Incident Report
13 hours ago
- #outage
- #cloud-infrastructure
- #incident-response
- Railway experienced a platform-wide outage lasting approximately 8 hours on May 19-20, 2026, after Google Cloud incorrectly suspended Railway's production account.
- The suspension immediately disrupted GCP-hosted infrastructure, including the dashboard, API, control plane, and databases, causing 503 errors and login failures.
- As cached network routes expired, the outage cascaded to workloads on Railway Metal and AWS, leading to widespread 404 errors and rendering all regions unreachable.
- Recovery involved restoring GCP account access, persistent disks, compute instances, and networking; services came back gradually over several hours.
- During recovery, GitHub rate-limited Railway's OAuth and webhook integrations due to a surge in retry requests, temporarily blocking logins and builds.
- Railway takes responsibility for architectural dependencies that allowed a single provider action to cause a full-platform outage and is implementing changes to prevent recurrence.
- Planned improvements include removing the dependency on GCP for the network control plane, extending high-availability database shards across AWS and Metal, and removing Google Cloud from the data plane's hot path.
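The cascade described above, where workloads outside GCP stayed reachable only until cached routes expired, can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Railway's actual implementation: `EdgeRouteCache`, `FakeControlPlane`, the 60-second TTL, and the hostname are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


class ControlPlaneDown(Exception):
    """Raised when the (hypothetical) route source cannot be reached."""


class FakeControlPlane:
    """Stand-in for a control plane hosted with a single provider."""

    def __init__(self) -> None:
        self.up = True

    def lookup(self, host: str) -> str:
        if not self.up:
            raise ControlPlaneDown(host)
        return f"backend-for-{host}"


@dataclass
class EdgeRouteCache:
    """Hypothetical edge proxy cache: routes stay valid for `ttl` seconds."""

    fetch: Callable[[str], str]  # asks the control plane for a backend
    ttl: float                   # seconds a cached route remains usable
    _entries: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def resolve(self, host: str, now: float) -> Optional[str]:
        cached = self._entries.get(host)
        if cached is not None and now < cached[1]:
            return cached[0]  # fresh hit: no control-plane call needed
        try:
            backend = self.fetch(host)
        except ControlPlaneDown:
            # Entry expired and cannot be refreshed: the route is lost.
            self._entries.pop(host, None)
            return None
        self._entries[host] = (backend, now + self.ttl)
        return backend


def status_for(cache: EdgeRouteCache, host: str, now: float) -> int:
    """Map a route lookup to the HTTP status a client would see."""
    return 200 if cache.resolve(host, now) is not None else 404


def simulate() -> List[int]:
    """Replay a simplified timeline: healthy, provider down, cache expired."""
    plane = FakeControlPlane()
    cache = EdgeRouteCache(fetch=plane.lookup, ttl=60.0)
    host = "app.example.test"  # hypothetical workload hostname

    statuses = [status_for(cache, host, now=0.0)]    # route fetched and cached
    plane.up = False                                 # provider account suspended
    statuses.append(status_for(cache, host, now=30.0))  # still served from cache
    statuses.append(status_for(cache, host, now=61.0))  # TTL passed, no refresh
    return statuses


print(simulate())  # [200, 200, 404]
```

In this model the workload itself never fails; it becomes unreachable only because the edge can no longer refresh its route. That is the kind of coupling the planned change to remove GCP from the network control plane is meant to eliminate.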