
New Attack Method Bypasses LLM Defenses
AI, security, LLM, attack, research
A researcher has published findings on a new attack that manipulates large language models (LLMs) by embedding ordinary language in their prior context. The embedded text alters the model's decision-making without any adversarial signature or explicit command. Because the attack relies on framing disguised as fact rather than on a detectable payload, it bypasses existing defenses. In tests across four frontier models, it reversed binary decisions, and the effect persisted through agentic pipelines and survived summarization. The paper and demos are available, and the findings were shared with major AI providers and security organizations through coordinated disclosure.