Lethal Trifecta

A security vulnerability pattern identified by Simon Willison that occurs when an AI agent simultaneously possesses three capabilities: (1) access to private or sensitive data, (2) exposure to untrusted content, and (3) the ability to communicate externally. When all three are present, a prompt injection attack can cause the agent to read private data and transmit it to an attacker. The Lethal Trifecta has been demonstrated against major products including Microsoft 365 Copilot, ChatGPT, Google Gemini, Slack, and GitHub Copilot. Because prompt injection remains an unsolved problem, the primary defense is to ensure that an AI system never combines all three capabilities at once.
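The mitigation can be made mechanical: audit the combined capabilities of an agent's enabled tools and refuse configurations that cover the full trifecta. A minimal sketch, using hypothetical capability labels and tool names (none of this is from a real framework):

```python
# Hypothetical capability labels for the three trifecta legs.
PRIVATE_DATA = "private_data"        # e.g. reads email, files, databases
UNTRUSTED_CONTENT = "untrusted"      # e.g. browses the web, reads inbound messages
EXTERNAL_COMMS = "external_comms"    # e.g. sends HTTP requests, posts messages

TRIFECTA = {PRIVATE_DATA, UNTRUSTED_CONTENT, EXTERNAL_COMMS}

def has_lethal_trifecta(tool_capabilities: dict[str, set[str]]) -> bool:
    """Return True if the enabled tools, taken together, cover all three
    capabilities — the condition under which prompt injection can exfiltrate data."""
    combined: set[str] = set()
    for caps in tool_capabilities.values():
        combined |= caps
    return TRIFECTA <= combined

# Example agent configuration (hypothetical tool names):
agent_tools = {
    "email_reader": {PRIVATE_DATA},
    "web_browser": {UNTRUSTED_CONTENT},
    "http_post": {EXTERNAL_COMMS},
}
print(has_lethal_trifecta(agent_tools))  # True — all three legs present
del agent_tools["http_post"]
print(has_lethal_trifecta(agent_tools))  # False — removing one leg breaks the pattern
```

Note that the check is over the *union* of capabilities across tools: no single tool needs all three legs for the agent as a whole to be vulnerable.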

See: Agentic AI; Agents Rule of Two; Exfiltration; Prompt injection; Tool permissions