
Researcher Demonstrates Combining LLMs and Static Analysis Tools to Identify Code Vulnerabilities
🎬 The video presents a Black Hat talk by Simcha Kosman, a researcher at CyberArk Labs, on using large language models (LLMs) alongside static analysis tools such as CodeQL to identify vulnerabilities in code. Kosman shows that directly prompting an LLM to find vulnerabilities (for example, in 10 files from cURL) yields false positives. The talk attributes this to the "where" and "what" problems: without guidance, LLMs struggle to pinpoint where in the code a vulnerability sits and what type it is. The proposed solution pairs CodeQL’s static analysis with an LLM. To address the context challenge, CodeQL databases are converted into CSV files so the code around a finding can be looked up quickly; to address the focus challenge, "guided questions" direct the LLM's attention to the relevant code flows. Testing on 100 top C repositories reduced false positives by 73% and identified real CVEs in projects such as Redis, FFmpeg, and Linux. The tool, Vulnhalla, automates this process. It supports C/C++; other languages and additional guided questions have to be added manually. The approach emphasizes preparing CodeQL results for the LLM to assess rather than relying on LLMs to find vulnerabilities independently.
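The CSV lookup and guided-question steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the CSV columns (`name`, `file`, `start_line`, `end_line`), the sample rows, the question wording, and every function name here are hypothetical. It shows the general idea of mapping a static-analysis finding to its enclosing function and then asking the LLM targeted questions about that function instead of an open-ended "find the bug".

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class FunctionRecord:
    name: str
    file: str
    start_line: int
    end_line: int


# Hypothetical CSV export of function locations; the real schema may differ.
SAMPLE_FUNCTIONS_CSV = """name,file,start_line,end_line
parse_header,src/http.c,10,42
copy_body,src/http.c,44,80
main,src/main.c,1,30
"""

# Hypothetical guided questions, keyed by issue type.
GUIDED_QUESTIONS = {
    "buffer-overflow": [
        "What is the size of the destination buffer, and where is it set?",
        "Is the copy length checked against that size before the copy?",
    ],
}


def load_functions(csv_text):
    """Index function records by file so lookups avoid scanning everything."""
    index = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        rec = FunctionRecord(
            row["name"], row["file"], int(row["start_line"]), int(row["end_line"])
        )
        index.setdefault(rec.file, []).append(rec)
    return index


def enclosing_function(index, file, line):
    """Return the function whose line range contains `line`, or None."""
    for rec in index.get(file, []):
        if rec.start_line <= line <= rec.end_line:
            return rec
    return None


def build_prompt(issue_type, file, line, code, index):
    """Build a prompt that tells the LLM where to look and what to check."""
    func = enclosing_function(index, file, line)
    where = func.name if func else "<unknown function>"
    questions = GUIDED_QUESTIONS.get(issue_type, [])
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return (
        f"A static analyzer flagged a possible {issue_type} at {file}:{line}, "
        f"inside function `{where}`.\n\n"
        f"Code:\n{code}\n\n"
        f"Answer these questions before giving a verdict:\n{numbered}\n"
    )


if __name__ == "__main__":
    idx = load_functions(SAMPLE_FUNCTIONS_CSV)
    print(build_prompt("buffer-overflow", "src/http.c", 50,
                       "memcpy(dst, src, len);", idx))
```

The finding's location answers the "where" problem and the issue type selects the questions that answer the "what" problem, so the model is asked to verify a specific claim about a specific function.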