Hasty Briefs (beta)

Cerebras Now Supports OpenAI gpt-oss-120B at 3,000 Tokens per Second

9 months ago
  • #Cerebras
  • #AI
  • #OpenAI
  • Cerebras Systems announced inference support for OpenAI's gpt-oss-120B model, achieving record-breaking speeds of 3,000 tokens per second.
  • The gpt-oss-120B model offers performance comparable to top proprietary models like Gemini 2.5 Flash and Claude Opus 4, with added speed, cost efficiency, and openness.
  • Cerebras' wafer-scale AI infrastructure avoids the memory-bandwidth bottlenecks of multi-GPU clusters, enabling full-model inference at unprecedented speeds.
  • Developers can easily switch to Cerebras-powered gpt-oss-120B with no refactoring, gaining instant access to high-performance AI.
  • OpenAI's Apache 2.0 license allows users to fine-tune, deploy on-prem, or move across clouds freely.
  • Cerebras Cloud offers free API access to gpt-oss-120B, enabling live coding assistants, document Q&A, and fast research chains.
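The "switch with no refactoring" claim rests on Cerebras exposing an OpenAI-compatible chat-completions endpoint, so clients only change the base URL and API key. A minimal sketch of the request a client would build, assuming the endpoint `https://api.cerebras.ai/v1` and the model identifier `gpt-oss-120b` (both taken as assumptions here; `CEREBRAS_API_KEY` is a placeholder):

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible base URL on Cerebras Cloud (see lead-in).
BASE_URL = "https://api.cerebras.ai/v1"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completions request."""
    payload = {
        "model": "gpt-oss-120b",  # assumed model identifier on Cerebras
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder key; a real client reads it from the environment.
            "Authorization": f"Bearer {os.environ.get('CEREBRAS_API_KEY', 'sk-...')}",
        },
        method="POST",
    )

req = build_chat_request("Summarize wafer-scale inference in one sentence.")
print(req.full_url)  # only the base URL differs from a stock OpenAI client
print(json.loads(req.data)["model"])
```

In practice the same effect is achieved by pointing an existing OpenAI SDK client at the new base URL; the request and response shapes stay the same, which is what makes the migration refactoring-free.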