Hasty Briefs


OpenAI's new open weight (Apache 2) models are good

9 months ago
  • #AI
  • #Machine Learning
  • #OpenAI
  • OpenAI released new open weight models under Apache 2.0 license: gpt-oss-120b and gpt-oss-20b.
  • gpt-oss-120b achieves near-parity with the proprietary o4-mini on reasoning benchmarks while running on a single 80GB GPU.
  • gpt-oss-20b delivers performance comparable to o3-mini and can run on edge devices with 16GB of memory.
  • Both models use a mixture-of-experts architecture, activating 5.1B (gpt-oss-120b) and 3.6B (gpt-oss-20b) parameters per token.
  • Models perform well on PhD-level science questions (GPQA Diamond benchmark).
  • gpt-oss-20b runs efficiently on a Mac with 32GB RAM, using ~12GB for inference.
  • Models support three reasoning levels (low, medium, high) that trade speed against accuracy.
  • OpenAI Harmony is introduced as a new prompt template format with system, developer, user, assistant, and tool roles.
  • Models trained on trillions of tokens, focusing on STEM, coding, and general knowledge, with safety filters.
  • Training costs estimated between $4.2M-$23.1M for gpt-oss-120b and $420K-$2.3M for gpt-oss-20b.
  • Models support tool calling for web browsing, Python execution, and developer-defined functions.
  • Competitive with recent Chinese open models (Qwen, Moonshot, Z.ai), potentially surpassing them.
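
The mixture-of-experts point above is why the models activate only a few billion of their total parameters per token: a router scores all experts but runs just the top-k. A minimal sketch of that routing idea (not the actual gpt-oss implementation; shapes, k, and the linear "experts" are illustrative):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=4):
    """Minimal top-k mixture-of-experts layer: only k experts run per token."""
    logits = x @ gate_w                      # router score for every expert
    top = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over just the selected experts
    # Weighted sum of only the selected experts' outputs; the rest never run
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.standard_normal(d)                   # one token's hidden state
gate_w = rng.standard_normal((d, n_experts))
# Each "expert" here is just a small linear map, captured per-lambda
expert_ws = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda v, w=w: v @ w for w in expert_ws]

y = moe_forward(x, gate_w, experts, k=4)
print(y.shape)  # (8,)
```

With k=4 of 16 experts active, only a quarter of the expert parameters touch each token, which is the mechanism behind the 5.1B-of-117B-style active/total split.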
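
The Harmony format and the reasoning-level switch interact: the level is stated in the system turn, and each turn is wrapped in special tokens. A sketch of assembling such a prompt, assuming the published `<|start|>role<|message|>…<|end|>` framing (treat the exact token and field names as illustrative, not a spec):

```python
def harmony_prompt(system, developer, user, reasoning="medium"):
    """Assemble a Harmony-style prompt string for the roles listed above.
    The reasoning level (low/medium/high) rides along in the system turn."""
    def turn(role, content):
        return f"<|start|>{role}<|message|>{content}<|end|>"
    parts = [
        turn("system", f"You are a helpful assistant.\nReasoning: {reasoning}"
             if system is None else f"{system}\nReasoning: {reasoning}"),
        turn("developer", developer),
        turn("user", user),
        "<|start|>assistant",  # left open: the model continues from here
    ]
    return "".join(parts)

prompt = harmony_prompt(
    system="You are a helpful assistant.",
    developer="Answer in one sentence.",
    user="Why is the sky blue?",
    reasoning="high",
)
print(prompt.startswith("<|start|>system"))  # True
```

In practice you would use OpenAI's own Harmony renderer rather than hand-building strings, since the real format also carries channels for chain-of-thought versus final answers.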
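
For the developer-defined-function side of tool calling, the usual pattern is: describe the tool with a JSON schema, let the model emit a JSON call, then dispatch it to real code. A hedged sketch of the host-side plumbing (the `get_weather` tool and its schema are invented for illustration):

```python
import json

# A developer-defined tool described in the JSON-schema style used by
# OpenAI-compatible APIs; this is what the model sees when choosing tools.
TOOLS = {
    "get_weather": {
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
}

def dispatch(tool_call_json, registry):
    """Parse a model-emitted tool call and run the matching local function."""
    call = json.loads(tool_call_json)
    fn = registry[call["name"]]
    return fn(**call["arguments"])

# Stand-in implementation; a real one would hit a weather API
registry = {"get_weather": lambda city: f"Sunny in {city}"}
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}', registry)
print(result)  # Sunny in Paris
```

The tool's string result would then be fed back to the model in a tool-role turn so it can compose the final answer; built-in tools like web browsing and Python execution follow the same call-and-return loop.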