Everything I know about good system design
8 days ago
- #best-practices
- #system-design
- #software-engineering
- Good system design is often underwhelming and self-effacing, focusing on simplicity and reliability rather than complexity.
- Stateful components should be minimized because they can get into bad states and are harder to repair automatically than stateless components.
- Databases are critical in system design: schemas should balance flexibility with readability, and indexes should match the most common queries.
- Avoid database bottlenecks by optimizing queries, using JOINs instead of stitching results together in application memory, and directing read queries to replicas to reduce load on the write node.
- Background jobs are essential for handling slow operations, typically managed via queues and job runners, with some cases requiring custom solutions.
- Caching should be used judiciously; junior engineers tend to overuse it, while senior engineers recognize the risks of stale or inconsistent data.
- Events are useful for decoupling services but should not replace direct API calls when immediate feedback or tight integration is needed.
- Hot paths, the most critical and highest-traffic parts of the system, require focused design attention because mistakes there have the largest impact and there are fewer workable solutions.
- Aggressive logging, especially on unhappy paths, and metrics for operational monitoring are crucial for diagnosing issues.
- Design for graceful failure with killswitches, retries, and idempotency keys, and decide whether to fail open or closed based on the feature's requirements.
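
To make the last point concrete, here is a minimal sketch of retries combined with an idempotency key. Everything in it is hypothetical and for illustration only: `PaymentService`, `TransientError`, and `charge_with_retries` are made-up names, and the in-memory dictionary stands in for whatever durable store a real service would use. The idea is that the caller generates one key per logical operation and reuses it on every retry, so a request whose response was lost can be retried without repeating the side effect.

```python
import time
import uuid


class TransientError(Exception):
    """Stands in for a timeout or dropped connection (hypothetical)."""


class PaymentService:
    """Toy in-memory service; a real one would persist keys durably."""

    def __init__(self, fail_first_n=0):
        self._results = {}  # idempotency key -> stored result
        self._fail_remaining = fail_first_n
        self.charges_applied = 0

    def charge(self, idempotency_key, amount_cents):
        # Replay of a known key: return the stored result, do not charge again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]

        self.charges_applied += 1
        result = {"charge_id": str(uuid.uuid4()), "amount_cents": amount_cents}
        self._results[idempotency_key] = result

        # Simulate the response being lost after the work already succeeded.
        if self._fail_remaining > 0:
            self._fail_remaining -= 1
            raise TransientError("response lost")
        return result


def charge_with_retries(service, amount_cents, max_attempts=3, base_delay=0.01):
    # One key per logical operation, reused across every retry.
    key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return service.charge(key, amount_cents)
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff


service = PaymentService(fail_first_n=1)
result = charge_with_retries(service, 500)
print(service.charges_applied, result["amount_cents"])
```

In this sketch the first attempt performs the charge but raises before the caller sees the result, and the retry is answered from the stored result, so the charge is applied only once. Without the key, the same retry would have charged twice.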