Migrating the American Express Payment Network, Twice
5 hours ago
- #Zero Downtime Migration
- #Payments Network
- #Kubernetes
- American Express migrated its Payments Network twice with zero downtime and no interrupted transactions.
- The first migration moved the network from a legacy system to a new microservices-based architecture while operating under strict operational constraints.
- Key subsystems in the migration included the Global Transaction Router (GTR) for routing and the Payments Processing Platform for business logic.
- To minimize risk, the migration strategy was divided into three stages: Connection Migration, Shadow Traffic, and Canary Routing.
- The second migration moved to a new Kubernetes infrastructure, using infrastructure-as-code and, once again, canary routing.
- Lessons learned emphasized the importance of traffic control, rollback capabilities, observability, shadow traffic, and infrastructure-as-code.
- The article highlights the critical role of patience and discipline in ensuring reliability during large-scale migrations.
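
The Shadow Traffic and Canary Routing stages named above can be illustrated with a minimal Python sketch. This is not American Express's implementation, which the summary does not describe. Every name here (`canary_bucket`, `choose_backend`, `shadow_compare`, and the `"legacy"`/`"new"` labels) is hypothetical, and the sketch assumes each transaction carries a stable string identifier.

```python
import hashlib


def canary_bucket(transaction_id: str) -> int:
    """Map a transaction ID to a stable bucket in the range 0..99.

    Hashing (rather than random choice) keeps the decision deterministic,
    so the same ID always lands in the same bucket.
    """
    digest = hashlib.sha256(transaction_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_backend(transaction_id: str, canary_percent: int) -> str:
    """Canary routing: send roughly canary_percent of IDs to the new system.

    Setting canary_percent to 0 acts as an immediate rollback switch.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    if canary_bucket(transaction_id) < canary_percent:
        return "new"
    return "legacy"


def shadow_compare(request, legacy_handler, new_handler, mismatches):
    """Shadow traffic: the legacy response is always the one returned.

    The new system receives a copy of the request; differences and errors
    are recorded in `mismatches` and never reach the caller.
    """
    legacy_response = legacy_handler(request)
    try:
        shadow_response = new_handler(request)
        if shadow_response != legacy_response:
            mismatches.append((request, legacy_response, shadow_response))
    except Exception as exc:  # a shadow failure must not affect live traffic
        mismatches.append((request, legacy_response, repr(exc)))
    return legacy_response
```

Because the bucket is derived from a hash, raising `canary_percent` only moves additional transactions to the new system; any transaction already routed to `"new"` at 10% stays there at 50%. In a real deployment the shadow call would more likely run asynchronously so that it adds no latency to the live path; it is synchronous here only to keep the sketch short.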