A human postmortem of the 1996 AOL outage
a day ago
- #Internet History
- #Economic Inequality
- #Site Reliability
- The August 1996 AOL outage lasted 19 hours and became a major news story, highlighting both the internet's growing importance in daily life and public awareness of reliability issues.
- Technical causes of AOL outages included maintenance failures and power issues, but the post focuses on human impacts: disrupted product launches, personal boredom, and potentially life-altering consequences for people who relied on online services.
- Site reliability engineering (SRE) often prioritizes technical details over human stories. Yet outages affect people unevenly: marginalized groups bear disproportionate costs, which reveals economic inequalities in technology access and recovery.
- Economic pressures such as cost-cutting and enshittification can undermine reliability, especially when monopolistic practices reduce competition and raise switching costs for users, which weakens the financial incentive to stay reliable.
- Proposed solutions include incorporating victim impact statements to highlight personal outage effects, outsourcing research to academia for deeper sociotechnical analysis, and SREs advocating for reliability as a moral imperative beyond profit motives.