
OpenAI's new open-source model is basically Phi-5

  • #Synthetic Data
  • #AI Safety
  • #OpenAI
  • OpenAI released its first open-source large language models, gpt-oss-120b and gpt-oss-20b, with mixed performance on benchmarks.
  • The models excel on some benchmarks but underperform on others, such as SimpleQA, and show little knowledge outside the domains they were trained on.
  • Microsoft's Phi-series models, developed under Sébastien Bubeck (who has since moved to OpenAI), were trained on synthetic data and performed well on benchmarks but poorly on real-world tasks.
  • Synthetic data gives the trainer control over exactly what the model sees, making the resulting model safer but potentially less versatile; a minimal sketch of such a pipeline follows this list.
  • OpenAI likely adopted synthetic data for safety: a fully controlled training set keeps the open-source models from picking up subversive capabilities while still letting them score well on benchmarks.
  • OpenAI's main business remains closed-source models, reducing the need for their open-source models to excel in real-world applications.
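
To make the synthetic-data point concrete, here is a minimal sketch of the kind of pipeline described above: a strong "teacher" model generates training examples from a fixed list of seed topics, and every example passes a content filter before it is kept. The seed topics, the teacher stub, and the blocklist are illustrative assumptions, not details of OpenAI's or Microsoft's actual pipelines.

```python
# Illustrative sketch of a synthetic-data pipeline (assumed, not OpenAI's or
# Microsoft's actual process): generate examples from seed topics, filter them,
# and write the survivors to a JSONL training file.
import json
import random

SEED_TOPICS = ["linear algebra", "unit testing in Python", "basic chemistry"]
BLOCKLIST = ["how to make", "bypass", "exploit"]  # toy safety filter


def teacher_generate(topic: str) -> str:
    """Placeholder for a call to a strong 'teacher' model.

    In a real pipeline this would be an API or local inference call.
    """
    return f"Q: Explain {topic}.\nA: <teacher model's explanation of {topic}>"


def is_safe(text: str) -> bool:
    """Crude content check: because the curriculum is generated rather than
    scraped, every example can be inspected (or regenerated) before training."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)


def build_dataset(n_examples: int, path: str = "synthetic.jsonl") -> None:
    """Write n_examples filtered synthetic examples to a JSONL file."""
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        while written < n_examples:
            topic = random.choice(SEED_TOPICS)
            sample = teacher_generate(topic)
            if not is_safe(sample):
                continue  # discard and resample instead of training on it
            f.write(json.dumps({"topic": topic, "text": sample}) + "\n")
            written += 1


if __name__ == "__main__":
    build_dataset(10)
```

The sketch illustrates the trade-off in the bullets above: nothing enters the dataset unless it was generated and checked, so the trainer controls exactly what the model can learn, at the cost of the long tail of real-world knowledge found in scraped data.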