The coming industrialisation of exploit generation with LLMs

4 months ago

The experiment built agents on Opus 4.5 and GPT-5.2 to write exploits for a zero-day vulnerability in QuickJS.
The agents produced over 40 distinct exploits across 6 scenarios, and GPT-5.2 solved every scenario.
Key conclusion: Offensive cybersecurity tasks may soon be industrialized, with token throughput, rather than the number of human hackers, becoming the limiting factor.
Exploit development is ideal for industrialization due to easy environment setup, well-understood tools, and straightforward verification.
Challenges remain for tasks requiring real-time interaction in adversarial environments, such as lateral movement and maintaining access.
Current models like Opus 4.5 and GPT-5.2 show promise in automating vulnerability discovery and exploit development.
The author calls on frontier labs and AI Security Institutes to evaluate models against real, hard targets using zero-day vulnerabilities.
Researchers are encouraged to experiment with high-token-budget exploitation problems and share their results.
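
The points about token throughput and straightforward verification can be illustrated with a toy generate-and-verify loop. This is a minimal sketch, not the setup used in the experiment: `budgeted_search`, `propose`, `verify`, and the flat 1,000-token cost per attempt are all hypothetical names and numbers chosen for illustration. It only shows that when a candidate can be checked automatically, the chance of success is bounded by how many tokens can be spent.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class SearchResult:
    solved: bool
    attempts: int
    tokens_spent: int
    solution: Optional[str]


def budgeted_search(
    propose: Callable[[int], Tuple[str, int]],
    verify: Callable[[str], bool],
    token_budget: int,
) -> SearchResult:
    """Propose candidates until one verifies or the token budget runs out.

    propose(attempt) returns (candidate, tokens_used): a stand-in for one agent run.
    verify(candidate) returns pass/fail: a stand-in for an automatic check.
    """
    tokens_spent = 0
    attempts = 0
    while tokens_spent < token_budget:
        candidate, cost = propose(attempts)
        attempts += 1
        tokens_spent += cost
        if verify(candidate):
            return SearchResult(True, attempts, tokens_spent, candidate)
    return SearchResult(False, attempts, tokens_spent, None)


# Toy stand-ins (assumptions): the "agent" guesses integers in order,
# and the "verifier" accepts exactly one of them.
def toy_propose(attempt: int) -> Tuple[str, int]:
    return str(attempt), 1000  # hypothetical flat cost per attempt


def toy_verify(candidate: str) -> bool:
    return candidate == "7"


small = budgeted_search(toy_propose, toy_verify, token_budget=5_000)
large = budgeted_search(toy_propose, toy_verify, token_budget=50_000)
print(small)
print(large)
```

With these stand-ins, the 5,000-token run gives up after five attempts, while the 50,000-token run succeeds on its eighth attempt after spending 8,000 tokens.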
