SubQ: Sub-quadratic LLM built for 12M-token context
- #long-context
- #LLM
- #sub-quadratic
- SubQ is a sub-quadratic LLM capable of reasoning across 12 million tokens efficiently.
- It reduces attention computation by almost 1000x relative to standard quadratic-attention LLMs.
- The model is cost-effective, processing 12M-token contexts at one-fifth the cost.
- Key use cases include reasoning over entire code repositories, months of pull-request history, and long-running agent state.
- SubQ achieves competitive benchmark scores, such as 95.0% on RULER @ 128K.
- It targets developers and enterprises through a full-context API whose cost scales linearly with context length.
- Subquadratic focuses on foundational changes to model architecture that make large-context inference efficient.
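
The post does not say which sub-quadratic mechanism SubQ uses. As a rough illustration of how attention can drop below quadratic cost, here is a minimal kernelized (linear) attention sketch in NumPy: reordering the matmuls replaces the n x n score matrix with a d x d summary, so cost falls from O(n^2 d) to O(n d^2). The feature map and all names here are illustrative assumptions, not SubQ's actual architecture.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: materializes an n x n score matrix -> O(n^2 * d).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, eps=1e-6):
    # Kernelized attention: phi(Q) @ (phi(K).T @ V) reorders the matmuls,
    # so the per-sequence cost is O(n * d^2) -- linear in length n.
    # phi is an illustrative positive feature map, not SubQ's.
    phi = lambda x: np.maximum(x, 0) + 1.0
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                  # d x d summary, independent of n
    Z = Qf @ Kf.sum(axis=0)        # per-query normalizer
    return (Qf @ KV) / (Z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 512, 64
Q, K, V = (rng.standard_normal((n, d)) * 0.1 for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (512, 64)
```

The FLOP ratio between the two forms scales roughly as n/d, which grows very large at multi-million-token contexts; a reduction on the order of the post's claimed 1000x is plausible at that scale, though the exact figure depends on the mechanism SubQ actually uses.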