OpenAI bypasses Nvidia with unusually fast coding model on plate-sized chips
12 days ago
- #AI
- #Coding
- #OpenAI
- OpenAI released GPT-5.3-Codex-Spark, its first production AI model running on non-Nvidia hardware (Cerebras chips).
- The model delivers code at over 1,000 tokens per second, 15x faster than its predecessor.
- For comparison, Anthropic’s Claude Opus 4.6 in fast mode reaches 2.5x its standard speed (68.2 tokens per second).
- Codex-Spark is available to ChatGPT Pro subscribers ($200/month) via Codex app, CLI, and VS Code extension.
- Features a 128,000-token context window and is text-only at launch.
- Tuned for speed over depth: optimized for coding rather than the general-purpose tasks GPT-5.3 targets.
- Outperforms GPT-5.1-Codex-mini on SWE-Bench Pro and Terminal-Bench 2.0, though these results lack independent validation.
- Previously, Codex was slower than Claude Code in side-by-side tests (e.g., building a Minesweeper clone).
- GPT-5.3-Codex-Spark’s speed (1,000 tokens/sec) surpasses OpenAI’s Nvidia-based models (GPT-4o: ~147 tokens/sec).
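To put the throughput figures above in perspective, here is a minimal sketch that computes the relative speedups from the numbers cited in this post (1,000 tokens/sec for Codex-Spark, ~147 for GPT-4o, 68.2 for Claude Opus 4.6). The figures are taken from the article; real-world throughput varies with load, prompt size, and output length, and the model labels in the dictionary are for illustration only.

```python
# Throughput figures as reported in the article (tokens per second).
# These are point-in-time, vendor-reported numbers, not benchmarks run here.
speeds = {
    "GPT-5.3-Codex-Spark (Cerebras)": 1000.0,  # "over 1,000 tokens per second"
    "GPT-4o (Nvidia)": 147.0,                  # approximate
    "Claude Opus 4.6": 68.2,                   # figure cited for comparison
}

baseline = speeds["GPT-4o (Nvidia)"]

# Print each model's throughput and its speedup relative to GPT-4o,
# fastest first.
for name, tps in sorted(speeds.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {tps:g} tok/s ({tps / baseline:.1f}x GPT-4o)")
```

By this arithmetic, Codex-Spark's cited speed works out to roughly a 6.8x advantage over GPT-4o's ~147 tokens/sec, which is what makes the Cerebras hardware angle the core of the story.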