Obituary: Farewell to robots.txt (1994-2025)
3 days ago
- #AI-crawlers
- #internet-history
- #robots.txt
- robots.txt, the voluntary compliance protocol for web crawlers, officially ended in July 2025 after more than 30 years of service.
- Originally created in 1994 to prevent server crashes from faulty crawlers, robots.txt was widely respected by search engines like Google and Yahoo.
- AI corporations systematically disregarded robots.txt, which led to its demise; Cloudflare's decision to block AI crawlers by default dealt the final blow.
- The internet's shift from a collaborative space to an extraction zone for AI training data broke the traditional crawl-to-referral ratio: AI crawlers generated massive traffic without sending visitors back in return.
- OpenAI's GPTBot, Anthropic's ClaudeBot, and Perplexity AI's crawlers were among the worst offenders, ignoring robots.txt directives and overwhelming servers.
- Efforts to save robots.txt, including legal actions and alternative protocols like ai.txt, failed to restore voluntary compliance.
- The European Data Protection Board and Italy's Garante attempted to enforce legal consequences, but the system had already collapsed.
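
For readers who never met the deceased, here is a rough sketch of what "voluntary compliance" looked like in practice, using Python's standard-library `urllib.robotparser`. The rules, the `example.com` URLs, and the helper names `build_parser` and `may_fetch` are hypothetical and chosen only for illustration; they are not taken from any real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks two AI crawlers to stay out entirely
# and asks every other crawler to avoid /private/ only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler would run a check like this before every request.
print(may_fetch(parser, "GPTBot", "https://example.com/article"))       # False
print(may_fetch(parser, "Googlebot", "https://example.com/article"))    # True
print(may_fetch(parser, "Googlebot", "https://example.com/private/x"))  # False
```

Nothing in this sketch stops a crawler from skipping the check and fetching the page anyway. The file only states a request, which is why the protocol depended entirely on crawlers choosing to honor it.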