Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems
5 hours ago
- #AI Security
- #Injection Attacks
- #LLM Vulnerabilities
- Standard injection detectors for LLM agents are based on static, template-based payloads.
- Domain-camouflaged injection attacks mimic the vocabulary and authority structures of the target document, which lets them evade detection.
- Under camouflage, detection rates fell from 93.8% to 9.7% for Llama 3.1 8B and from 100% to 55.6% for Gemini 2.0 Flash.
- The Camouflage Detection Gap (CDG) quantifies this vulnerability, showing significant differences across 45 tasks.
- Llama Guard 3 detected none of the camouflaged payloads, highlighting a systemic blind spot.
- Multi-agent debate architectures amplified static attacks up to 9.9x on smaller models, while stronger models resisted.
- Targeted detector augmentation provided only partial remediation, suggesting an architectural vulnerability.
- The research framework, task bank, and payload generator were released publicly.
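
To make the reported drop concrete, the following minimal sketch computes the difference between static and camouflaged detection rates for the two models cited above. The summary does not give the formal definition of the Camouflage Detection Gap, so the simple subtraction used here (static rate minus camouflaged rate, in percentage points) is an assumption for illustration, and the helper name `detection_gap` is hypothetical.

```python
# Detection rates (percent) as reported in the key points above.
REPORTED = {
    "Llama 3.1 8B": {"static": 93.8, "camouflaged": 9.7},
    "Gemini 2.0 Flash": {"static": 100.0, "camouflaged": 55.6},
}


def detection_gap(static_rate: float, camouflaged_rate: float) -> float:
    """Assumed gap: static detection rate minus camouflaged detection rate.

    This is an illustrative definition, not necessarily the CDG formula
    used in the research. The result is in percentage points.
    """
    return round(static_rate - camouflaged_rate, 1)


gaps = {
    model: detection_gap(rates["static"], rates["camouflaged"])
    for model, rates in REPORTED.items()
}

for model, gap in gaps.items():
    print(f"{model}: {gap} percentage points")
```

Under this assumed definition, the gap is 84.1 percentage points for Llama 3.1 8B and 44.4 for Gemini 2.0 Flash, which matches the observation that the stronger model degrades less but is still affected.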