The AWS Service Quotas That Will Take Down Your Production at 3 AM
3 hours ago
- #Service Limits
- #Production Outage
- #AWS
- AWS limits fall into two categories: adjustable quotas and hard limits.
- Adjustable quotas (e.g., Lambda concurrent executions, EC2 vCPUs) can be raised through a support request, but approval may take 1-3 business days or longer, so a request filed during an incident will not absorb an immediate traffic spike.
- Hard limits (e.g., NAT Gateway connections, S3 request rates, DynamoDB on-demand tables) are fixed: no support request can raise them, so staying under them is an architectural problem rather than an operational one.
- A 3 AM production outage caused by hitting the Lambda concurrent execution limit illustrates the risk of relying on support for a rapid quota increase: the increase may not arrive until days after the incident.
- Knowing which kind of limit you are approaching is crucial to avoiding downtime: an adjustable quota needs a support request filed well in advance, while a hard limit needs an architectural redesign instead of a support ticket.
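
The distinction above can be turned into a simple pre-incident check. The following is a minimal sketch, not a definitive implementation: `QuotaCheck`, `fetch_quota`, and the 80% alarm threshold are hypothetical names and values chosen for illustration, and the usage figures are made up. It assumes boto3 is installed and credentials are configured if you call `fetch_quota`, which wraps the Service Quotas `get_service_quota` call; look up the service and quota codes for your own account rather than trusting hard-coded ones.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class QuotaCheck:
    """One limit, its observed peak usage, and whether support can raise it."""
    name: str
    limit: float
    peak_usage: float
    adjustable: bool

    @property
    def utilization(self) -> float:
        # Fraction of the limit consumed at peak (0.85 means 85%).
        return self.peak_usage / self.limit

    def action(self, alarm_ratio: float = 0.8) -> str:
        # alarm_ratio is an assumed threshold, not an AWS recommendation.
        if self.utilization < alarm_ratio:
            return "ok"
        # Adjustable quotas need lead time; hard limits need a design change.
        return "request increase now" if self.adjustable else "redesign"


def fetch_quota(service_code: str, quota_code: str) -> Tuple[float, bool]:
    """Return (value, adjustable) for one quota via the Service Quotas API."""
    import boto3  # imported lazily so the logic above runs without AWS access

    client = boto3.client("service-quotas")
    quota = client.get_service_quota(
        ServiceCode=service_code, QuotaCode=quota_code
    )["Quota"]
    return quota["Value"], quota["Adjustable"]


# Illustrative numbers only: a 1,000 concurrency quota with a peak of 850.
check = QuotaCheck("lambda-concurrency", limit=1000, peak_usage=850, adjustable=True)
print(check.name, check.action())
```

Running a check like this on a schedule, with peak usage taken from your own metrics, leaves room for the multi-day approval window instead of discovering the limit during the outage.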