
OpenAI's new open-source model is basically Phi-5

  • #Synthetic Data
  • #AI Safety
  • #OpenAI
  • OpenAI released its first open-source large language models, gpt-oss-120b and gpt-oss-20b, with mixed performance on benchmarks.
  • The models excel on some benchmarks but underperform on others, such as SimpleQA, and show little knowledge outside the domains they were trained on.
  • Microsoft's Phi-series models, developed under Sébastien Bubeck (who has since moved to OpenAI), were trained on synthetic data and performed well on benchmarks but poorly on real-world tasks.
  • Synthetic data gives the trainer control over exactly what the model sees, making the resulting model safer but potentially less versatile; a minimal sketch of such a pipeline follows this list.
  • OpenAI likely adopted synthetic data for safety: a fully controlled training set keeps the open-source models from picking up subversive capabilities while still letting them score well on benchmarks.
  • OpenAI's main business remains closed-source models, reducing the need for their open-source models to excel in real-world applications.
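
To make the synthetic-data point concrete, here is a minimal sketch of the kind of pipeline described above: a strong "teacher" model generates training examples from a fixed list of seed topics, and every example passes a content filter before it is kept. The seed topics, the teacher stub, and the blocklist are illustrative assumptions, not details of OpenAI's or Microsoft's actual pipelines.

```python
# Illustrative sketch of a synthetic-data pipeline (assumed, not OpenAI's or
# Microsoft's actual process): generate examples from seed topics, filter them,
# and write the survivors to a JSONL training file.
import json
import random

SEED_TOPICS = ["linear algebra", "unit testing in Python", "basic chemistry"]
BLOCKLIST = ["how to make", "bypass", "exploit"]  # toy safety filter


def teacher_generate(topic: str) -> str:
    """Placeholder for a call to a strong 'teacher' model.

    In a real pipeline this would be an API or local inference call.
    """
    return f"Q: Explain {topic}.\nA: <teacher model's explanation of {topic}>"


def is_safe(text: str) -> bool:
    """Crude content check: because the curriculum is generated rather than
    scraped, every example can be inspected (or regenerated) before training."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)


def build_dataset(n_examples: int, path: str = "synthetic.jsonl") -> None:
    """Write n_examples filtered synthetic examples to a JSONL file."""
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        while written < n_examples:
            topic = random.choice(SEED_TOPICS)
            sample = teacher_generate(topic)
            if not is_safe(sample):
                continue  # discard and resample instead of training on it
            f.write(json.dumps({"topic": topic, "text": sample}) + "\n")
            written += 1


if __name__ == "__main__":
    build_dataset(10)
```

The sketch illustrates the trade-off in the bullets above: nothing enters the dataset unless it was generated and checked, so the trainer controls exactly what the model can learn, at the cost of the long tail of real-world knowledge found in scraped data.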