The coming industrialisation of exploit generation with LLMs
4 months ago
- #exploit-development
- #cybersecurity
- #AI-agents
- Experiment involved building agents on Opus 4.5 and GPT-5.2 to write exploits for a QuickJS zeroday vulnerability.
- Agents successfully created over 40 distinct exploits across 6 scenarios, with GPT-5.2 solving all scenarios.
- Key conclusion: Offensive cybersecurity tasks may soon be industrialized, with token throughput, rather than the number of human hackers, becoming the limiting factor.
- Exploit development is ideal for industrialization due to easy environment setup, well-understood tools, and straightforward verification.
- Challenges remain for tasks requiring real-time interaction in adversarial environments, such as lateral movement and maintaining access.
- Current models like Opus 4.5 and GPT-5.2 show promise in automating vulnerability discovery and exploit development.
- Call for frontier labs and AI Security Institutes to evaluate models against real, hard targets using zeroday vulnerabilities.
- Encouragement for researchers to experiment with high-token-budget exploitation problems and share results.
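
The points above about straightforward verification and token throughput as the limiting factor can be illustrated with a minimal sketch of a budgeted search loop. This is not the author's agent harness. The names `Attempt`, `search_with_budget`, `toy_attempts`, and `toy_verify` are hypothetical, and the candidates and verifier are toy stand-ins. The sketch assumes only that each attempt has a token cost and that success can be checked automatically.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass
class Attempt:
    candidate: str    # output of a (hypothetical) agent run
    tokens_used: int  # tokens spent producing this candidate


def search_with_budget(
    attempts: Iterable[Attempt],
    verify: Callable[[str], bool],
    token_budget: int,
) -> Tuple[Optional[str], int]:
    """Return (first verified candidate, tokens spent), or (None, tokens spent)."""
    spent = 0
    for attempt in attempts:
        if spent + attempt.tokens_used > token_budget:
            break  # the budget, not human effort, ends the search
        spent += attempt.tokens_used
        if verify(attempt.candidate):  # automatic pass/fail check
            return attempt.candidate, spent
    return None, spent


# Toy stand-ins (assumptions for illustration only).
def toy_attempts():
    for i in range(10):
        yield Attempt(candidate=f"candidate-{i}", tokens_used=1000)


def toy_verify(candidate: str) -> bool:
    return candidate == "candidate-3"


found, spent = search_with_budget(toy_attempts(), toy_verify, token_budget=5000)
print(found, spent)
```

In this toy setup, raising `token_budget` is the only thing that changes whether the search succeeds, which is the sense in which throughput becomes the bottleneck once setup and verification are automated.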