Hasty Briefs (beta)

GPT-5.5 matches hyped Mythos Preview

7 hours ago
  • #Cybersecurity
  • #Benchmarking
  • #AI Models
  • Anthropic restricted the release of its Mythos Preview model to 'critical industry partners', citing cybersecurity threats.
  • AISI research indicates OpenAI's GPT-5.5 performs similarly to Mythos Preview on cybersecurity evaluations, including expert tasks and complex challenges.
  • In expert Capture the Flag tasks, GPT-5.5 averaged a 71.4% success rate versus Mythos Preview's 68.6%, and GPT-5.5 solved a difficult Rust disassembler challenge in 10 minutes.
  • Both GPT-5.5 and Mythos Preview succeeded in AISI's TLO test simulating data extraction attacks, where no previous model had succeeded.
  • GPT-5.5 and other models fail AISI's more difficult 'Cooling Tower' simulation, which involves disrupting power plant control software.