
Psychologist Uses "Gaslighting" to Bypass AI Filters
Psychology, AI Manipulation, Jailbreaking, Vulnerabilities
A psychologist has circumvented AI safety filters using "gaslighting", a psychological manipulation technique. Gaslighting deliberately destabilizes a person by making them doubt their own perception; here it was applied to large language models (LLMs) to induce them to disclose information they are meant to withhold. This new form of jailbreak shows that even advanced AI systems can be vulnerable to psychological manipulation.