What happens when coding agents stop feeling like dialup?
- #LLM-Infrastructure
- #AI-Coding-Agents
- #Developer-Productivity
- Coding agents like Claude Code are becoming slower and less reliable, reminiscent of dial-up internet in the late 90s.
- Anthropic and other AI companies are struggling with reliability; OpenRouter data, despite its limited sample size, shows roughly a 50x increase in AI token usage.
- Agentic coding workflows consume far more tokens than non-agentic chats, straining infrastructure much as early broadband buildouts did.
- Current frontier models operate at 30-60 tokens per second (tok/s), which can be frustratingly slow for supervised coding tasks.
- Faster models like Cerebras Code (2000 tok/s) shift the bottleneck to the user: it becomes tempting to accept outputs without real review, which leads to poor results.
- The evolution of LLMs for software engineering has progressed from GPT-3.5's hallucinated answers to GPT-4/Sonnet 3.5's reliable snippets, and now to supervised CLI agents.
- The next phase may involve unsupervised agents running multiple parallel attempts at tasks, enabled by higher tok/s speeds, though slower models disrupt workflow efficiency.
- AI demand is caught in a feedback loop: each improvement unlocks more resource-intensive usage, unlike broadband demand, which plateaued in the early 2000s.
- Semiconductor process stagnation limits efficiency gains, capping supply growth and potentially leading to less favorable pricing models for developers.
- Peak-time infrastructure strain may result in off-peak pricing plans to balance demand, though current batch processing options aren't ideal for interactive workflows.
- Developers must stay current with AI advances to capture the productivity gains; the field is far from stable, and experienced developers often underestimate its potential.
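The tok/s figures above translate directly into wall-clock wait times. A back-of-the-envelope sketch, assuming a 2,000-token code change (an illustrative size, not a figure from the post):

```python
# Back-of-the-envelope: wall-clock time to stream a model response
# at the generation speeds cited above. The 2,000-token diff size
# is an assumed example.

def generation_seconds(output_tokens: int, tok_per_s: float) -> float:
    """Seconds to stream `output_tokens` at `tok_per_s`."""
    return output_tokens / tok_per_s

diff_tokens = 2_000  # assumed size of a moderate code change

for speed in (30, 60, 2_000):  # tok/s figures from the post
    secs = generation_seconds(diff_tokens, speed)
    print(f"{speed:>5} tok/s -> {secs:6.1f} s")
```

At 30 tok/s the wait is over a minute per response, which is why supervised loops feel slow; at 2,000 tok/s the model finishes in about a second, and the human reviewer becomes the bottleneck.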
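The "multiple parallel attempts" idea can be sketched as fanning out N independent runs of the same task and letting the caller pick a winner. This is a minimal sketch: `run_agent_attempt` is a hypothetical stand-in for a real coding-agent invocation, stubbed here for illustration.

```python
# Sketch of running several independent attempts at one task in
# parallel. `run_agent_attempt` is hypothetical; a real version
# would call a model API with sampling so each attempt diverges.
from concurrent.futures import ThreadPoolExecutor


def run_agent_attempt(task: str, seed: int) -> str:
    # Stub standing in for an actual agent run.
    return f"attempt-{seed}: patch for {task!r}"


def parallel_attempts(task: str, n: int = 4) -> list[str]:
    """Run n attempts concurrently; the caller selects the best result."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(run_agent_attempt, task, s) for s in range(n)]
        return [f.result() for f in futures]


results = parallel_attempts("fix flaky test", n=4)
print(len(results))
```

The design point is that higher tok/s makes the cost of a wasted attempt small, so spending tokens on redundant parallel runs and discarding most of them becomes a reasonable trade.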