The current state of LLM-driven development
- #LLM Limitations
- #Coding Tools
- #AI Development
- LLMs slot into coding workflows with essentially no learning curve, but they don't magically produce production-ready code.
- LLMs struggle with code organization and perform best on mature, well-documented codebases.
- LLM agents are essentially API calls in a loop with no real reflection, relying on tools such as code navigation, file edits, and shell commands.
- Stability is a major issue with AI tools due to frequent model updates and pricing changes.
- LLMs fail at writing rare or complex code, excelling only in common, well-trodden tasks.
- Claude 4 Sonnet outperforms models such as Gemini 2.5 Pro and GPT-4.1/5 in agentic workflows.
- Local models are underpowered for coding compared to large closed models.
- GitHub Copilot is cost-effective but heavily tied to VSCode and suffers from feature bloat.
- Claude Code Pro is terminal-based and optimized for Claude 4 but lacks a good interface.
- Google's Gemini tools are buggy and poorly managed, despite the model's potential.
- AI-first IDEs like Kiro and Cursor are opaque and often don't work well.
- LLMs excel with strongly typed languages like Rust but struggle with dynamically typed Python unless type hints are present.
- Best use cases include implementing standards, writing tests, bug fixes, and processing documentation.
- LLMs tend to add unnecessary complexity and code duplication, which can make developers worse over time.
- Frontend development with LLMs is problematic, especially for custom components and interactions.
- Recommendation: GitHub Copilot for its value and customizability, but LLMs aren't mandatory for all workflows.
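The "agents are essentially API calls" point above can be made concrete with a minimal sketch. This is not any vendor's actual API; `call_model`, the tool names, and the message format are all hypothetical stand-ins for a chat-completion endpoint that can request tool invocations:

```python
import subprocess

# Hypothetical tools of the kind coding agents expose: shell access and file edits.
def run_shell(cmd: str) -> str:
    """Run a shell command and return its stdout."""
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

def edit_file(path: str, content: str) -> str:
    """Overwrite a file with new content."""
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {path}"

TOOLS = {"run_shell": run_shell, "edit_file": edit_file}

def agent_loop(call_model, task: str, max_steps: int = 10) -> str:
    """Drive the model until it stops requesting tools. There is no planning
    or reflection here: each step is one plain API call, and the 'agency' is
    just feeding tool output back into the conversation history."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(history)        # one stateless API call per step
        if reply.get("tool") is None:      # no tool requested: final answer
            return reply["content"]
        result = TOOLS[reply["tool"]](**reply["args"])
        history.append({"role": "tool", "content": result})
    return "step limit reached"
```

The entire control flow lives client-side, which is why these tools are so sensitive to model updates: the loop has no intelligence of its own.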
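The typing point can be illustrated in Python itself. With explicit annotations, a checker like mypy can catch an LLM's wrong guess (a misnamed field, a float where an int is expected) before runtime; without them, the same mistake only surfaces when the code runs. The `Invoice` example below is purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class Invoice:
    amount_cents: int
    paid: bool

def outstanding_total(invoices: list[Invoice]) -> int:
    """With these annotations, a hallucinated `inv.amount` or a float
    return value is flagged by mypy; stripped of hints, it isn't."""
    return sum(inv.amount_cents for inv in invoices if not inv.paid)
```

This is the same feedback loop that makes Rust pleasant for LLMs: the compiler rejects a large class of plausible-looking wrong code immediately.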
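Test writing is a good example of the "well-trodden" work where LLMs shine: table-driven tests over a pure function follow a pattern the models have seen countless times. The `slugify` function here is a hypothetical target, and the test is the shape of output one can reasonably expect:

```python
def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase and hyphenate a title."""
    return "-".join(title.lower().split())

def test_slugify() -> None:
    # The kind of table-driven test LLMs generate reliably for common code.
    cases = [
        ("Hello World", "hello-world"),
        ("  spaced  out  ", "spaced-out"),
        ("single", "single"),
    ]
    for title, expected in cases:
        assert slugify(title) == expected
```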