Subquadratic – Introducing SubQ 1.1 Small
- #Enterprise AI
- #AI Model
- #Long-Context Reasoning
- SubQ 1.1 Small is the second iteration of a Subquadratic Sparse Attention (SSA) model, designed for reasoning over large artifacts like codebases and documents.
- It achieves near-perfect long-context retrieval up to 12M tokens with up to 1,000x attention compute reduction, balancing long-context optimization with strong general reasoning.
- It scores highly on the Needle-In-A-Haystack and RULER long-context benchmarks and performs strongly on knowledge, coding, and agentic evaluations, including GPQA Diamond and LiveCodeBench.
- The model uses SSA for linear scaling with context length, requiring 64.5x less compute than dense attention and running 56x faster than FlashAttention-2 at 1M tokens.
- Training replaced dense attention with SSA and extended pretraining on long artifacts, which made multi-million-token experiments efficient to run.
- Use cases include financial analysis, legal contract work, and software engineering, where reasoning across complete artifacts is essential.
- Plans include deployment with design partners, a broader rollout, and general model releases by year-end.
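
To make the linear-scaling claim above concrete, here is a minimal back-of-the-envelope sketch. It is not SSA itself, whose selection mechanism the summary does not describe. It assumes a hypothetical sparse scheme in which every query attends to a fixed budget of `keys_per_query` keys, and it counts only the score and value matrix multiplications. The function names, `head_dim`, and the budget of 4096 are illustrative assumptions, so the printed ratios are not expected to match the reported 64.5x or 1,000x figures.

```python
def dense_attention_flops(n_tokens: int, head_dim: int) -> int:
    """Approximate FLOPs for one dense attention head.

    Counts the QK^T score matmul and the weighted sum over V:
    two (n x n x d) products at 2 FLOPs per multiply-add.
    """
    return 4 * n_tokens * n_tokens * head_dim


def sparse_attention_flops(n_tokens: int, head_dim: int, keys_per_query: int) -> int:
    """Approximate FLOPs when each query attends to a fixed number of keys.

    ASSUMPTION: a fixed per-query key budget, used here only to illustrate
    linear scaling. The cost of choosing which keys to attend to is ignored.
    """
    budget = min(keys_per_query, n_tokens)  # cannot attend to more keys than exist
    return 4 * n_tokens * budget * head_dim


def reduction_factor(n_tokens: int, head_dim: int = 128, keys_per_query: int = 4096) -> float:
    """Ratio of dense to sparse attention FLOPs at a given context length."""
    dense = dense_attention_flops(n_tokens, head_dim)
    sparse = sparse_attention_flops(n_tokens, head_dim, keys_per_query)
    return dense / sparse


if __name__ == "__main__":
    for n in (128_000, 1_000_000, 12_000_000):
        print(f"{n:>12,} tokens: dense/sparse FLOP ratio = {reduction_factor(n):,.1f}")
```

Under this assumption the dense cost grows with the square of the context length while the sparse cost grows linearly, so the reduction factor itself grows in proportion to context length. That is consistent with the summary reporting a larger reduction at 12M tokens than at 1M.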