Chroma Context-1: Training a Self-Editing Search Agent

12 hours ago

Chroma introduces Context-1, a 20B-parameter agentic search model for retrieval-augmented generation (RAG).
Context-1 addresses limitations of single-stage retrieval by enabling multi-hop retrieval through iterative query decomposition and evidence gathering.
The model is trained on synthetic tasks, focusing on planning and evaluation skills, with a curriculum shifting from recall to precision.
Context-1 features self-editing context management, allowing selective retention or discarding of retrieved documents to manage context window size.
Performance benchmarks show Context-1 matches or exceeds frontier models in retrieval tasks across web, finance, legal, and email domains.
The model is released as open weights, along with the synthetic task generation pipeline, to support reproducibility and further research.
Future directions include expanding task diversity, improving tool use and search infrastructure, and enhancing context management strategies.
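
The multi-hop loop and self-editing context described above can be illustrated with a small toy sketch. This is not Context-1's implementation: the names `agentic_search`, `plan`, `search`, and `score`, the whitespace token count, and the greedy keep-the-best-that-fits rule are all assumptions made for illustration. In the real system the model itself would presumably decide which queries to issue and which documents to drop.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str


def token_count(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def agentic_search(
    question: str,
    plan: Callable[[str, list], Optional[str]],   # hypothetical: next query, or None to stop
    search: Callable[[str], list],                # hypothetical: query -> retrieved docs
    score: Callable[[str, Doc], float],           # hypothetical: relevance of a doc
    budget_tokens: int,
    max_hops: int = 4,
) -> list:
    context: dict = {}
    discarded: set = set()
    for _ in range(max_hops):
        query = plan(question, list(context.values()))
        if query is None:
            break
        # Gather evidence for this hop, skipping docs already rejected.
        for doc in search(query):
            if doc.doc_id not in discarded:
                context[doc.doc_id] = doc
        # Self-edit: keep the highest-scoring docs that fit the budget.
        ranked = sorted(context.values(), key=lambda d: score(question, d), reverse=True)
        kept, used = {}, 0
        for doc in ranked:
            cost = token_count(doc.text)
            if used + cost <= budget_tokens:
                kept[doc.doc_id] = doc
                used += cost
            else:
                discarded.add(doc.doc_id)
        context = kept
    return list(context.values())


def run_demo() -> list:
    corpus = [
        Doc("d1", "Ada Lovelace worked with Charles Babbage"),
        Doc("d2", "Charles Babbage designed the Analytical Engine"),
        Doc("d3", "Ada is also a programming language"),  # distractor
    ]
    keywords = {"lovelace", "babbage", "engine"}

    def search(query: str) -> list:
        words = set(query.lower().split())
        return [d for d in corpus if words & set(d.text.lower().split())]

    def score(_question: str, doc: Doc) -> float:
        return float(len(keywords & set(doc.text.lower().split())))

    def plan(_question: str, context: list) -> Optional[str]:
        ids = {d.doc_id for d in context}
        if "d1" not in ids:
            return "Ada Lovelace"      # hop 1: find the collaborator
        if "d2" not in ids:
            return "Charles Babbage"   # hop 2: follow the collaborator
        return None                    # enough evidence gathered

    docs = agentic_search(
        "Which machine did Ada Lovelace's collaborator design?",
        plan, search, score, budget_tokens=12,
    )
    return [d.doc_id for d in docs]
```

Calling `run_demo()` walks through two hops: the first query also pulls in the distractor `d3`, and the second hop's pruning step drops it once the two relevant documents fill the 12-token budget.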
