Ten Years of Deploying to Production
3 days ago
- #DevOps
- #Platform Engineering
- #Production Deployment
- In 2018, the company had an operations team (Ops) managing production deployments, which were infrequent (once every two weeks).
- The Ops team had a tool for spinning up VMs, which was crucial for resource-heavy tasks such as GPU training.
- Deployments were rigid: an error could delay a release by weeks unless Ops was accommodating.
- The data science team faced challenges with misbehaving models in production, leading to customer complaints.
- There was no formal PR review process; code was often edited directly on VMs and pushed to GitHub haphazardly.
- The author took the initiative to improve DevOps practices by setting up an internal PyPI index, using git tags, creating Chef recipes, and establishing PR reviews.
- By 2026, the focus had shifted to platform engineering, emphasizing developer experience, fast CI/CD, and resilient production systems.
- The contrast between 2018 (rigid, ops-centric) and 2026 (flexible, developer-centric) highlights the evolution in deployment and operations culture.