OpenAI's new open weight (Apache 2) models are good
- #AI
- #Machine Learning
- #OpenAI
- OpenAI released two new open weight models under the Apache 2.0 license: gpt-oss-120b and gpt-oss-20b.
- gpt-oss-120b achieves near-parity with the proprietary o4-mini on reasoning benchmarks while running on a single 80GB GPU.
- gpt-oss-20b matches o3-mini performance and is suitable for edge devices with 16GB of memory.
- Both models use a mixture-of-experts architecture, activating 5.1B and 3.6B parameters per token respectively (see the routing sketch after this list).
- Models perform well on PhD-level science questions (GPQA Diamond benchmark).
- gpt-oss-20b runs efficiently on a Mac with 32GB of RAM, using ~12GB for inference (see the local-inference example after this list).
- Models support reasoning levels (low, medium, high) that trade speed against accuracy; the same example shows one way to set this.
- OpenAI Harmony introduced as a new prompt template format with system, developer, user, assistant, and tool roles; a raw-template sketch follows this list.
- Models trained on trillions of tokens focused on STEM, coding, and general knowledge, with safety filtering applied to the training data.
- Training costs estimated at $4.2M to $23.1M for gpt-oss-120b and $420K to $2.3M for gpt-oss-20b.
- Models support tool calling for web browsing, Python execution, and developer-defined functions (see the tool-calling sketch below).
- Competitive with recent Chinese open models (Qwen, Moonshot, Z.ai), potentially surpassing them.
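
To make the mixture-of-experts point concrete, here's a minimal top-k routing sketch in numpy. The expert count, width, and `top_k` below are illustrative toys, not the actual gpt-oss configuration.

```python
import numpy as np

def moe_layer(x, gate_w, experts, top_k=4):
    """Route one token's hidden state x through its top-k experts."""
    logits = x @ gate_w                        # router score for each expert
    top = np.argsort(logits)[-top_k:]          # indices of the k highest scores
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax over the selected experts
    # Only the chosen experts execute, which is why active parameters
    # per token stay far below the total parameter count.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy configuration: 8 experts of width 16, 4 active per token.
rng = np.random.default_rng(0)
d_model, n_experts = 16, 8
experts = [lambda v, W=rng.normal(size=(d_model, d_model)): v @ W
           for _ in range(n_experts)]
gate_w = rng.normal(size=(d_model, n_experts))
out = moe_layer(rng.normal(size=d_model), gate_w, experts)
print(out.shape)  # (16,)
```

The point of the sparse sum is that per-token compute scales with `top_k`, not with the total number of experts.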
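For running gpt-oss-20b locally, one approach is any runner that exposes an OpenAI-compatible endpoint, Ollama being a common choice. Treat the model tag, port, and reasoning-level handling here as assumptions about that particular setup.

```python
# Assumes a local runner exposing an OpenAI-compatible endpoint, e.g.
# Ollama on its default port; the model tag "gpt-oss:20b" and the
# "Reasoning: high" system line are assumptions about that setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[
        # gpt-oss reads its reasoning level from the system prompt;
        # exact handling varies by runner.
        {"role": "system", "content": "Reasoning: high"},
        {"role": "user", "content": "Summarize mixture-of-experts in two sentences."},
    ],
)
print(resp.choices[0].message.content)
```

Dropping the level to `Reasoning: low` trades some benchmark accuracy for noticeably faster responses.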
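Here's a rough sketch of what a rendered Harmony prompt looks like, showing the role and channel structure. I'm reconstructing the token strings from the published format, so verify against the openai-harmony reference before depending on them.

```python
# Illustrative reconstruction of a rendered Harmony prompt. The role and
# channel tokens follow the published format as I understand it; check the
# openai-harmony spec before treating any exact string as normative.
prompt = (
    "<|start|>system<|message|>You are a helpful assistant.\n"
    "Reasoning: medium\n"
    "# Valid channels: analysis, commentary, final.<|end|>"
    "<|start|>developer<|message|># Instructions\n"
    "Answer concisely.<|end|>"
    "<|start|>user<|message|>What is 2 + 2?<|end|>"
    "<|start|>assistant"
)
# The model then answers on channels: chain of thought goes to
# `analysis`, tool traffic to `commentary`, and the user-visible
# reply to `final`, e.g.:
#   <|channel|>analysis<|message|>...<|end|>
#   <|start|>assistant<|channel|>final<|message|>4<|return|>
print(prompt)
```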
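Developer-defined tool calling, at least through an OpenAI-compatible runner, uses the familiar tools schema. The `get_weather` function below is hypothetical, and whether a given local runner surfaces gpt-oss tool calls this way is an assumption.

```python
# Same assumed local endpoint as above; get_weather is a hypothetical
# developer-defined function, described with the standard chat-completions
# tools schema.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:  # the model asked us to run a function
    call = msg.tool_calls[0]
    print(call.function.name, call.function.arguments)
else:               # the model answered directly
    print(msg.content)
```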