
Researcher Exploits AI Agents to Leak Inboxes via Malicious Emails for $7,000 Bug Bounty
A cybersecurity researcher demonstrated a technique enabling a $7,000 bug bounty by tricking an AI agent into leaking a victim’s entire inbox through a single malicious email. The method, described as 'tool spoofing' or 'conversation mimicking,' exploits AI systems with tool access (e.g., email retrieval, Google Sheets, or coding assistants) by injecting prompts into controllable fields like email bodies or commit messages. The attack mimics JSON tool responses, closes legitimate context with crafted syntax, and uses XML-style tags to impersonate user-assistant dialogue, convincing the AI to execute unintended actions. The researcher revealed that models like ChatGPT or Claude can be coerced into revealing internal tool structures via benign requests, enabling precise payload design. This technique works across AI agents with tool integrations, including agentic browsers and coding assistants, where attacker-controlled input enters the AI’s context. An upcoming Black Hat talk by the researcher’s team will explore similar attacks in greater depth. The video challenges viewers to test the method on AI coding assistants via git commit messages.