The RAG Obituary: Killed by agents, buried by context windows

15 hours ago


The article argues that Retrieval-Augmented Generation (RAG) is in decline, displaced by larger context windows and agent-based architectures.
RAG was initially developed to handle large knowledge bases that exceeded the token limits of early LLMs like GPT-3.5 and GPT-4.
Chunking documents for RAG presents challenges, such as fragmenting critical information and disrupting document structure.
The embedding-based semantic search at the core of RAG retrieval pipelines often fails on numerical data and domain-specific terminology.
Hybrid search, combining semantic and keyword search, improves retrieval but adds complexity and latency.
Reranking in RAG systems introduces additional costs, latency, and infrastructure burdens.
RAG systems have fundamental limitations, including context fragmentation, semantic search failures, and lack of causal understanding.
The emergence of agentic search, exemplified by Claude Code, offers a new paradigm that eliminates the need for RAG by leveraging large context windows and intelligent navigation.
Agentic search uses tools like Grep and Glob for direct filesystem access, enabling real-time, precise, and context-rich document analysis.
The future of AI search lies in agentic systems that can navigate, reason, and understand relationships across entire documents without retrieval fragmentation.
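
To make the Grep and Glob idea concrete, here is a minimal sketch of the kind of filesystem search an agent might call as a tool. It is not Claude Code's actual implementation; the function name `grep_files`, its parameters, and the `**/*.txt` default are assumptions chosen for illustration. It matches a regular expression against files selected by a glob pattern and returns each hit with its surrounding lines, so the caller sees intact context rather than a pre-cut chunk.

```python
import re
from pathlib import Path


def grep_files(root, pattern, glob="**/*.txt", context=1):
    """Search files under `root` for lines matching `pattern`.

    Hypothetical helper: returns a list of (file_name, line_number, snippet)
    tuples, where snippet includes `context` lines before and after the hit.
    """
    regex = re.compile(pattern)
    hits = []
    # Glob step: select candidate files by name pattern, in a stable order.
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        # Grep step: exact pattern match on the raw text, no embeddings.
        for i, line in enumerate(lines):
            if regex.search(line):
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                hits.append((path.name, i + 1, "\n".join(lines[lo:hi])))
    return hits
```

An agent could call something like `grep_files("filings", r"Note 12")`, read the returned snippets, and then decide which file to open in full or which term to search next. That loop of search, read, and follow-up is the navigation the summary contrasts with one-shot chunk retrieval.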
