AI in Cybersecurity: Team Shellfish's Experience in the AI Cyber Challenge

18 Aug 2025

Cybersecurity, AI, Bug Detection, Software Security

🎬 NEW VIDEO FROM @DEFCONConference

In this video, Team Shellfish, finalists of the AI Cyber Challenge, share their experience building an autonomous AI-based system that detects and fixes bugs in real software. The challenge involved finding vulnerabilities that had been injected into software and then automatically patching them. The team was initially skeptical that AI would be effective for this task, but came to see that advanced language models could play a crucial role in both detection and repair.

A key point of the discussion is how AI models have evolved over the past two years. Early models such as GPT 3.5 were not very effective, but later models improved significantly. The team found that LLMs (large language models) could not only identify bugs but also generate patches to fix them. They used tools like CodeQL for static analysis and built their own orchestration framework based on YAML files and Bash scripts.

An interesting technical aspect is the use of LLMs to generate fuzzers, tools that test software by feeding it random inputs in order to trigger crashes and expose bugs. The team found that LLMs could mutate the fuzzers to reach specific parts of the code, making bug detection more effective. For triage, they used LLMs to analyze crash reports, identify the root cause of each bug, and generate patches.

The team also discussed the challenges they faced. One was ensuring that a patch fixed all crash cases, not just the specific ones tested; they developed systems to test patches and confirm they could not be bypassed. Another was getting the LLMs to generate inputs that tested the code's limits, rather than just normal, well-structured inputs.

In terms of practical implications, the team plans to continue developing their system and to create a company that provides AI-based security solutions.
They believe these technologies can be particularly useful for small businesses that lack the resources for in-depth security analyses. They also emphasized the importance of experimentation and innovation in developing these systems.

In conclusion, this video offers a glimpse into the challenges and opportunities of using AI for software security. Team Shellfish showed that, despite initial challenges, LLMs can play a crucial role in detecting and fixing bugs, paving the way for significant advancements in cybersecurity.
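
To make the fuzzing and patch-validation ideas from the video more concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the toy `parse_record` target, its patched variant, and the helpers `mutate`, `fuzz`, and `validate_patch` are not taken from the team's system, which relied on LLMs and real fuzzing infrastructure. The sketch only shows the general shape of the loop: mutate inputs until the target crashes, then check that a candidate patch survives both the known crashing inputs and a fresh round of fuzzing.

```python
import random


def parse_record(data: bytes) -> int:
    """Toy buggy target (hypothetical): trusts the length byte in its header."""
    if len(data) < 2 or data[0] != 0x7F:
        return 0
    length = data[1]
    payload = data[2:]
    # Bug: the length byte is never checked against the real payload size.
    return payload[length - 1] if length else 0


def parse_record_patched(data: bytes) -> int:
    """Candidate patch (hypothetical): rejects lengths that exceed the payload."""
    if len(data) < 2 or data[0] != 0x7F:
        return 0
    length = data[1]
    payload = data[2:]
    if length == 0 or length > len(payload):
        return 0
    return payload[length - 1]


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one random mutation: overwrite, insert, or delete a byte."""
    data = bytearray(seed)
    choice = rng.randrange(3)
    if choice == 0 and data:
        data[rng.randrange(len(data))] = rng.randrange(256)
    elif choice == 1:
        data.insert(rng.randrange(len(data) + 1), rng.randrange(256))
    elif data:
        del data[rng.randrange(len(data))]
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 5000, rng_seed: int = 0) -> list:
    """Run the target on mutated inputs and collect those that raise."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes


def validate_patch(patched, crashes: list, seed: bytes) -> bool:
    """Accept a patch only if it handles known crashes and fresh fuzzing."""
    for data in crashes:
        try:
            patched(data)
        except Exception:
            return False
    # Re-fuzz with a different RNG seed to look for inputs that bypass the fix.
    return not fuzz(patched, seed, rng_seed=1)


if __name__ == "__main__":
    seed_input = b"\x7f\x03abc"
    found = fuzz(parse_record, seed_input)
    print(f"crashing inputs found: {len(found)}")
    print(f"patch accepted: {validate_patch(parse_record_patched, found, seed_input)}")
```

The second step in `validate_patch` mirrors the concern raised in the video that a patch should fix all crash cases rather than only the ones already seen. In the system described, LLMs steer input generation toward specific code paths; here blind random mutation stands in for that, purely to keep the example self-contained.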
