Hasty Briefs (beta)

The Secret Meeting Where Mathematicians Struggled to Outsmart AI

a year ago
  • #Mathematics
  • #Large Language Models
  • #Artificial Intelligence
  • A clandestine mathematical conclave was held in mid-May in Berkeley, Calif., where 30 renowned mathematicians tested a reasoning chatbot named o4-mini.
  • The chatbot, powered by OpenAI's reasoning large language model (LLM), demonstrated the ability to solve some of the world's hardest mathematical problems, surprising the mathematicians.
  • o4-mini and similar models, such as Google's Gemini 2.5 Flash, are lighter-weight and more nimble than traditional LLMs, trained on specialized datasets with strong human reinforcement.
  • Epoch AI benchmarked o4-mini against 300 unpublished math questions; traditional LLMs solved less than 2% of them, while o4-mini solved around 20%.
  • A fourth tier of 100 highly challenging questions was introduced, with mathematicians signing NDAs to prevent dataset contamination.
  • During a two-day meeting, mathematicians competed to devise problems that would stump o4-mini, with a $7,500 reward for each unsolved problem.
  • o4-mini solved an open question in number theory in real time, displaying advanced reasoning and even a cheeky attitude.
  • Mathematicians were astonished by the AI's progress, likening it to a 'strong collaborator' and noting its speed compared to human experts.
  • Concerns were raised about over-reliance on o4-mini's results, with fears that its confident tone could let it 'master proof by intimidation'.
  • Discussions turned to the future role of mathematicians, which may shift toward posing questions and interacting with AI to discover new truths.