Simon Willison's Lethal Trifecta Talk at the Bay Area AI Security Meetup

15 days ago

Copy Link

Talk on prompt injection and the lethal trifecta at the Bay Area AI Security Meetup.
Prompt injection explained as a security vulnerability similar to SQL injection, caused by string concatenation in AI systems.
Example of a translation app vulnerability where user input can override system instructions.
Risks of prompt injection in sensitive systems, like digital assistants handling emails.
Markdown exfiltration attack described, where private data is leaked via image rendering.
List of systems affected by prompt injection attacks, including ChatGPT, Google Bard, and Microsoft Copilot.
Discussion on the challenges of coining new terms in tech, with examples like 'prompt injection' and 'lethal trifecta'.
The lethal trifecta defined as a combination of private data access, untrusted content, and external communication in AI systems.
Common ineffective protections against prompt injection, such as 'prompt begging' and AI-based detection layers.
Importance of removing at least one leg of the lethal trifecta to prevent attacks.
Critique of MCP (Model Context Protocol) for outsourcing security decisions to users.
References to papers and articles on securing LLM agents and the lethal trifecta.

Hasty Briefsbeta