
Broken by Default: Study Finds LLM-Generated C/C++ Code Highly Vulnerable
LLM, C/C++, vulnerabilities, code security, formal verification, GPT-4o, Claude, Gemini, Llama, Mistral, CodeQL, Semgrep, Z3, SMT
A researcher used formal verification with the Z3 SMT solver to analyze 3,500 code samples generated by GPT-4o, Claude, Gemini, Llama, and Mistral. The study found that 55.8% of the samples contained at least one proven vulnerability, and it identified 1,055 concrete exploitation cases. GPT-4o performed worst, with a 62.4% vulnerability rate, and no model scored below 48%. In addition, six industry tools, including CodeQL and Semgrep, missed 97.8% of these vulnerabilities.
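
To make "proven vulnerability" and "concrete exploitation case" more tangible, here is a minimal sketch of what such a witness can look like. It is not the study's method and does not use Z3: it uses plain integer arithmetic to construct the kind of counterexample an SMT solver could return for a bit-vector query. The C pattern `malloc(count * elem_size)`, the 32-bit `size_t`, and every name below are assumptions made for illustration.

```python
# Hypothetical illustration; none of these names come from the study.
# Models the C pattern `buf = malloc(count * elem_size)` on a platform
# where size_t is assumed to be 32 bits, so the product wraps mod 2**32.

SIZE_T_BITS = 32
SIZE_T_MOD = 1 << SIZE_T_BITS


def wrapped_alloc_size(count: int, elem_size: int) -> int:
    """Size that malloc actually receives after unsigned wrap-around."""
    return (count * elem_size) % SIZE_T_MOD


def find_overflow_witness(elem_size: int):
    """Return (count, allocated, needed) with allocated < needed, or None.

    The witness is a concrete input for which the allocation is smaller
    than the number of bytes the caller will go on to write.
    """
    # Smallest count whose true byte requirement no longer fits in size_t.
    count = SIZE_T_MOD // elem_size + 1
    if count >= SIZE_T_MOD:
        # count itself is not representable as a size_t, so no witness.
        return None
    needed = count * elem_size
    allocated = wrapped_alloc_size(count, elem_size)
    if allocated < needed:
        return count, allocated, needed
    return None


if __name__ == "__main__":
    count, allocated, needed = find_overflow_witness(16)
    # prints: count=268435457 allocated=16 needed=4294967312
    print(f"count={count} allocated={allocated} needed={needed}")
```

A real verifier would search over all inputs symbolically rather than construct one by hand, but the result has the same shape: specific input values that violate a safety property, which is harder to dismiss than a pattern-based warning.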