Hasty Briefs (beta)

SubQ: Sub-quadratic LLM built for 12M-token context

6 hours ago
  • #long-context
  • #LLM
  • #sub-quadratic
  • SubQ is a sub-quadratic LLM capable of reasoning across 12 million tokens efficiently.
  • It reduces attention computation by almost 1000x compared to traditional LLMs.
  • The model is cost-effective, offering 12M-token processing at one-fifth the cost.
  • Key use cases include entire code repositories, months of PRs, and long-running agent states.
  • SubQ achieves competitive benchmark scores, such as 95.0% on RULER @ 128K.
  • It's designed for developers and enterprises via a full-context API with linear cost.
  • Subquadratic focuses on foundational changes to model architecture that enable efficient large-context inference.
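
The article does not describe SubQ's actual architecture, so as a generic illustration only: the usual way attention becomes sub-quadratic is to replace the O(n²·d) softmax attention with a kernelized (linear-attention-style) form that costs O(n·d²), so cost grows linearly in sequence length n. A minimal FLOP-count sketch, with a hypothetical 12M-token context and head dimension 128:

```python
# Generic illustration of why sub-quadratic attention wins at long context.
# This is NOT SubQ's published method; the functions and numbers below are
# assumptions chosen only to show the scaling argument.

def softmax_attention_flops(n: int, d: int) -> int:
    # Standard attention: Q·K^T costs n*n*d, softmax ~n*n, (attn)·V costs n*n*d.
    return 2 * n * n * d + n * n

def linear_attention_flops(n: int, d: int) -> int:
    # Kernelized attention: accumulate phi(K)^T·V (n*d*d), then apply Q (n*d*d).
    return 2 * n * d * d

n, d = 12_000_000, 128  # hypothetical: 12M-token context, head dim 128
speedup = softmax_attention_flops(n, d) / linear_attention_flops(n, d)
print(f"~{speedup:,.0f}x fewer attention FLOPs at n={n:,}")
```

With these assumed numbers the ratio is roughly n/d, comfortably above the "almost 1000x" reduction the article claims; the point is only that the gap grows linearly with context length, which is what makes 12M-token inference tractable.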