Eight more months of agents

2 days ago

https://crawshaw.io/blog/eight-more-months-of-agents

#AI
#LLM
#Programming

The author reflects on the rapid evolution of LLM-assisted programming over the past year, noting significant improvements in model capabilities, particularly in coding tasks.
Agent harnesses have seen little improvement, with some capabilities from six months ago still unmatched by current popular agents.
The author considers public model benchmarks unreliable because they are gamed, and highlights qualitative improvements in coding models as significant economic signals.
The author's coding time has shifted from writing code to reading it; the current ratio is 95-5 in favor of reading.
The author discusses the history and current state of IDEs and has moved away from them and back to Vi, despite earlier predictions that IDEs with LLM-assisted features would dominate.
The importance of using the best models (like Opus or GPT-7.9) is emphasized, despite the cost, to truly understand their capabilities.
Working with agents brings challenges, including the need for fresh VMs to avoid sandbox limitations.
The author is building exe.dev to address the need for unconstrained agents in easily accessible VMs.
The joy and increased productivity brought by agents in programming are contrasted with broader societal fears about AI.
A programming philosophy is introduced: the best software for an agent is what's best for a programmer, reversing traditional product development wisdom.
The author's experience with Stripe Sigma illustrates how agents can outperform traditional products by enabling custom solutions with minimal input.
