Actual LLM agents are coming

17 days ago


OpenAI released Deep Research, a model variant specialized for web and document search that can plan search strategies and cross-reference sources.
Claude 3.7 Sonnet applies similar advancements to code, outperforming earlier models on complex programming tasks.
Anthropic defines LLM agents as systems where LLMs dynamically direct their own processes and tool usage.
Most systems currently described as agentic are workflows that follow predefined code paths, which leaves them unable to plan, retain memory, or act effectively over the long term.
The 'bitter lesson' suggests that hardcoding knowledge into models is suboptimal; scaling computation through search and learning is better.
LLM agents are trained with reinforcement learning: verifiers score the model's outputs to produce rewards, and training often involves generating drafts and proceeding through multiple steps.
Training LLM agents involves generating large amounts of data through emulations or simulations, similar to game RL.
Actual LLM agents can automate complex processes like search, network engineering, and financial tasks without predefined prompts.
Big labs currently dominate LLM agent development due to their resources, but democratizing training and deployment is critical for broader adoption.
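
To make the point about verifier-based rewards more concrete, here is a minimal sketch of how drafts might be scored during reinforcement learning. Everything in it is an assumption for illustration: the `Draft` class, the `arithmetic_verifier` and `score_drafts` helpers, and the choice to subtract the group mean from each reward are hypothetical and are not taken from any lab's actual training code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Draft:
    """One candidate answer sampled from the model (hypothetical structure)."""
    text: str


def arithmetic_verifier(expected: int) -> Callable[[Draft], float]:
    """Build a toy verifier: reward 1.0 if the draft's last line is the
    expected integer, otherwise 0.0. Real verifiers would be task-specific."""
    def verify(draft: Draft) -> float:
        lines = draft.text.strip().splitlines()
        if not lines:
            return 0.0  # an empty draft earns no reward
        try:
            return 1.0 if int(lines[-1].strip()) == expected else 0.0
        except ValueError:
            return 0.0  # the final line is not an integer
    return verify


def score_drafts(drafts: List[Draft], verify: Callable[[Draft], float]) -> List[float]:
    """Score each draft and subtract the group mean, so drafts that beat
    their siblings get a positive training signal (an assumed baseline)."""
    rewards = [verify(d) for d in drafts]
    if not rewards:
        return []
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


if __name__ == "__main__":
    drafts = [Draft("2 + 2 is\n4"), Draft("5"), Draft("")]
    print(score_drafts(drafts, arithmetic_verifier(expected=4)))
```

In this sketch only the first draft passes the verifier, so it receives a positive score and the other two receive negative ones. A training loop would then use these scores to update the model, a step that is omitted here.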
