Cerebras Launches Qwen3-235B, Achieving 1,500 Tokens per Second

10 months ago

Cerebras Systems launched hosted inference for Qwen3-235B, which it bills as the world's fastest frontier AI model, with full 131K context support.
Qwen3-235B delivers production-grade code generation at 30x the speed and 1/10th the cost of closed-source alternatives.
Running on Cerebras' Wafer Scale Engine, the model generates output at 1,500 tokens per second.
The 131K-token context window lets Qwen3-235B take in large codebases and long, complex documents.
Cerebras partnered with Cline, a coding agent for Microsoft VS Code, to bring Qwen models to the editor with faster code generation.
The model is priced at $0.60 per million input tokens and $1.20 per million output tokens, significantly lower than competitors.
Cerebras' solution avoids the complexity of distributed computing, making it easier to deploy large AI models.
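
To make the quoted price and speed figures concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers stated above ($0.60 and $1.20 per million input and output tokens, 1,500 output tokens per second). The function name `estimate_request` and the example token counts are hypothetical, and the time estimate assumes generation runs at the quoted rate while ignoring prompt processing, queuing, and network latency.

```python
# Figures quoted in the brief above.
INPUT_USD_PER_MILLION = 0.60    # price per million input tokens
OUTPUT_USD_PER_MILLION = 1.20   # price per million output tokens
OUTPUT_TOKENS_PER_SECOND = 1_500


def estimate_request(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (cost in USD, generation time in seconds) for one request.

    Hypothetical helper: the time covers output generation only and
    assumes the quoted throughput holds for the whole response.
    """
    cost = (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    )
    seconds = output_tokens / OUTPUT_TOKENS_PER_SECOND
    return cost, seconds


if __name__ == "__main__":
    # Illustrative request: a 100,000-token prompt and a 3,000-token reply.
    cost, seconds = estimate_request(100_000, 3_000)
    print(f"cost: ${cost:.4f}, generation time: {seconds:.1f} s")
```

Under these assumptions, a prompt that fills most of the context window costs a few cents, and a reply of a few thousand tokens is generated in about two seconds.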
