Future AI bills of $100k/yr per dev
13 days ago
- #AI-tokenomics
- #Inference-costs
- #Open-source-AI
- Kilo surpassed 1 trillion tokens a month on OpenRouter.
- Open-source AI coding tools (Cline, Roo, Kilo) are growing rapidly as Cursor and Claude throttle their power users.
- The industry's flawed assumption: application inference costs were expected to drop, but they increased instead.
- Raw inference costs fell by 10x, but application costs grew because frontier model prices stayed constant while token consumption per application rose.
- Test-time scaling (long thinking) increased inference costs, requiring 100x more compute for complex queries.
- Longer context windows and bigger suggestions led to higher token consumption per application.
- Reflecting rising costs, Cursor introduced a $200/month plan, and Claude Code and others followed.
- Power users face throttling (rate limits, lower-quality models) unless they pay for inference directly.
- Open-source tools avoid throttling by letting users manage costs via task splitting, different modes, and hybrid model usage.
- App inference costs may reach $100k+ per developer per year, driven by agents running in parallel and needing less human feedback.
- Training costs dwarf inference costs, with top training engineers directing $100m+ in spending.
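
To make the arithmetic behind the $100k+ figure concrete, here is a minimal Python sketch of how a per-developer bill scales with token consumption and parallel agents. Every number in it (tokens per task, tasks per day, agent count, the $10 per million tokens price, 250 working days) is a hypothetical assumption chosen for illustration, not a figure from the post, and `annual_inference_cost` is an illustrative helper rather than any vendor's pricing formula.

```python
def annual_inference_cost(
    tokens_per_task: int,
    tasks_per_day: int,
    parallel_agents: int,
    price_per_million_tokens: float,
    working_days: int = 250,  # assumed working days per year
) -> float:
    """Estimate one developer's yearly inference bill in dollars.

    Cost = tokens consumed per year * price per token, with the
    per-token price held constant (frontier model prices staying flat).
    """
    tokens_per_year = (
        tokens_per_task * tasks_per_day * parallel_agents * working_days
    )
    return tokens_per_year / 1_000_000 * price_per_million_tokens


# Hypothetical inputs: all values below are assumptions for illustration.
PRICE = 10.0  # dollars per million tokens

# One agent, short tasks: 20k tokens per task.
short_tasks = annual_inference_cost(20_000, 4, 1, PRICE)

# Same price, but long thinking and long context use 100x the tokens per task.
long_tasks = annual_inference_cost(2_000_000, 4, 1, PRICE)

# Five such agents running in parallel with little human feedback.
parallel = annual_inference_cost(2_000_000, 4, 5, PRICE)

for label, cost in [
    ("short tasks, 1 agent", short_tasks),
    ("long tasks, 1 agent", long_tasks),
    ("long tasks, 5 agents", parallel),
]:
    print(f"{label}: ${cost:,.0f} per year")
```

Under these assumed inputs, the per-token price never changes, yet the bill grows from hundreds of dollars to six figures purely through higher token consumption per task and more agents running at once.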