High-Severity LMDeploy Vulnerability (CVE-2026-33626) Exploited Within 13 Hours of Disclosure

Key Takeaways

  • A high-severity Server-Side Request Forgery (SSRF) vulnerability (CVE-2026-33626, CVSS 7.5) in LMDeploy’s vision-language module allows attackers to make the server fetch arbitrary URLs without IP validation, enabling access to internal services such as AWS IMDS, databases, and administrative interfaces.
  • Threat actors exploited this flaw in the wild within 12 hours and 31 minutes of its public GitHub disclosure, demonstrating extremely rapid weaponization before patches could be applied.
  • During an eight-minute attack session, adversaries used the vulnerability as an SSRF primitive to perform internal network port-scanning (targeting 127.0.0.1), confirm egress via OOB DNS, and enumerate internal services like Redis and MySQL, leveraging multiple VLMs to evade detection.
  • This incident reflects a growing trend in which AI infrastructure vulnerabilities are hunted and exploited within hours of disclosure, with detailed advisories effectively serving as ready-made prompts from which LLMs can generate exploits.
  • Concurrent threats include active exploitation of high-severity flaws in WordPress plugins (Ninja Forms File Upload, Breeze Cache) and a global campaign targeting internet-exposed Modbus PLCs, highlighting broad vulnerability exploitation across diverse systems.

Vulnerability Details in LMDeploy
A high-severity security flaw, identified as CVE-2026-33626 with a CVSS score of 7.5 (rated High), was discovered in LMDeploy, a widely used open-source toolkit for compressing, deploying, and serving Large Language Models (LLMs). The vulnerability specifically resides in the vision-language module’s load_image() function located in lmdeploy/vl/utils.py. Project maintainers confirmed in their advisory that this function fetches arbitrary URLs provided by users or external inputs without performing adequate validation to block requests targeting internal or private IP address ranges (such as 127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16). This lack of SSRF protection allows malicious actors to redirect the server’s outbound requests toward sensitive internal resources that are not intended for public internet access.
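The missing check described above can be illustrated with a short sketch. This is not LMDeploy's actual fix; the helper names (is_url_allowed, is_public_address) and the injectable resolve parameter are hypothetical, chosen only to show one way of rejecting URLs whose host resolves to a loopback, private, or link-local address before any fetch is attempted.

```python
import ipaddress
import socket
from typing import Callable
from urllib.parse import urlsplit


def _resolve(host: str) -> list[str]:
    """Return every address the host resolves to (IP literals pass through)."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def is_public_address(addr: str) -> bool:
    """True only for globally routable addresses."""
    ip = ipaddress.ip_address(addr)
    # is_global is False for loopback (127.0.0.0/8), RFC 1918 ranges,
    # and link-local addresses such as 169.254.169.254 (cloud metadata).
    return ip.is_global


def is_url_allowed(url: str, resolve: Callable[[str], list[str]] = _resolve) -> bool:
    """Hypothetical pre-fetch guard: allow http(s) URLs with public targets only."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolve(parts.hostname)
    except (OSError, ValueError):
        return False  # unresolvable or malformed host: refuse rather than guess
    # Every resolved address must be public; one internal address is enough to reject.
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A check like this is only a first layer. It does not by itself cover HTTP redirects to internal addresses or a hostname that resolves differently between the check and the fetch, so the connection would also need to be pinned to the validated address and redirects re-validated.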

Rapid Exploitation Timeline and Initial Discovery
The vulnerability was publicly disclosed via a GitHub advisory, and remarkably, active exploitation was observed in the wild less than 13 hours later. Cloud security firm Sysdig detected the first exploitation attempt against its honeypot systems designed to mimic vulnerable LMDeploy deployments. According to their analysis published this week, the malicious activity originated from the IP address 103.116.72[.]119 and was detected on April 22, 2026, at 03:35 a.m. UTC – precisely 12 hours and 31 minutes after the vulnerability details were made public. This rapid transition from disclosure to active attack underscores the aggressive tactics employed by threat actors targeting newly revealed weaknesses in AI-related software infrastructure.

Attacker Techniques and Multi-Phase Exploitation
Sysdig’s forensic analysis revealed that the attacker did not merely confirm the vulnerability’s existence but used it as a general-purpose SSRF primitive during a concentrated eight-minute session. The exploitation unfolded across three distinct phases, involving a total of 10 separate HTTP requests. Apparently to avoid detection, the adversary switched between different Vision Language Models (VLMs), specifically internlm-xcomposer2 and OpenGVLab/InternVL2-8B, when making requests via the vulnerable load_image() function.

In the initial phase, the attacker targeted high-value internal services: first probing the AWS Instance Metadata Service (IMDS) at http://169.254.169.254/latest/meta-data/, a common route to cloud credentials, and then attempting to reach Redis and MySQL services running on the host or internal network.

The second phase verified outbound connectivity: the attacker sent a request to an out-of-band (OOB) DNS monitoring domain (requestrepo[.]com) to confirm that the SSRF vulnerability could reach arbitrary external hosts, a prerequisite for data exfiltration or command-and-control (C2) channels.

The final phase focused on internal reconnaissance. The attacker port-scanned the loopback interface (127.0.0.1), systematically probing common ports to map locally exposed services such as secondary HTTP administrative interfaces or other management tools, which could in turn reveal lateral movement opportunities.
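For defenders reviewing inference logs, the three phases above map to recognizable URL targets. The following is a rough triage sketch, not Sysdig's detection logic: the function names and the SAMPLE_SESSION list are hypothetical, the loopback ports are the default Redis and MySQL ports plus an arbitrary 8080, and oob.example.com merely stands in for an out-of-band callback domain.

```python
import ipaddress
from collections import Counter
from urllib.parse import urlsplit

METADATA_IP = ipaddress.ip_address("169.254.169.254")


def classify_target(url: str) -> str:
    """Label the target of an image URL taken from a request log."""
    host = urlsplit(url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return "hostname"  # named host; would need resolution to judge further
    if ip == METADATA_IP:
        return "cloud-metadata"
    if ip.is_loopback:
        return "loopback"
    if ip.is_private or ip.is_link_local:
        return "internal"
    return "external-ip"


def summarize_session(urls: list[str]) -> dict:
    """Count target labels and collect the distinct loopback ports probed."""
    labels: Counter = Counter()
    loopback_ports: set[int] = set()
    for url in urls:
        label = classify_target(url)
        labels[label] += 1
        if label == "loopback":
            port = urlsplit(url).port
            if port is not None:
                loopback_ports.add(port)
    return {"labels": dict(labels), "loopback_ports": sorted(loopback_ports)}


# Illustrative input only: these are not the attacker's actual requests.
SAMPLE_SESSION = [
    "http://169.254.169.254/latest/meta-data/",
    "http://127.0.0.1:6379/",
    "http://127.0.0.1:3306/",
    "http://oob.example.com/probe",
    "http://127.0.0.1:8080/",
]
```

Several distinct loopback ports requested through an image-loading parameter within a few minutes, as summarize_session(SAMPLE_SESSION) would surface, is the kind of pattern worth alerting on, since legitimate image URLs rarely point at the serving host itself.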

Implications and the Broader Pattern of Rapid Exploitation
The speed of this exploitation, which occurred well before most organizations could realistically assess, patch, or mitigate the newly disclosed flaw, is a stark reminder of the current threat landscape. Sysdig researchers explicitly connected this incident to a pattern observed over the preceding six months within the AI infrastructure sector. They noted that critical vulnerabilities affecting core components such as LLM inference servers, model gateways, and agent orchestration tools are consistently being weaponized within hours of advisory publication, irrespective of the size of the software’s user base or its perceived obscurity. This trend is amplified by the nature of modern vulnerability disclosures in the AI/GenAI space. Advisories such as the one for CVE-2026-33626, which precisely name the affected file (lmdeploy/vl/utils.py), the vulnerable parameter (the URL input to load_image()), and the root cause (missing IP validation), and even provide sample vulnerable code, function almost like a ready-made input prompt. Such specificity allows attackers, potentially using commercial LLMs themselves, to rapidly generate functional exploit code or SSRF payloads tailored to the vulnerability, drastically reducing the time and skill required to move from disclosure to active attack.

Concurrent Threats: WordPress Plugins and Modbus PLCs
The LMDeploy exploitation did not occur in isolation but coincided with other active campaigns targeting different classes of software and hardware. Threat actors were simultaneously observed exploiting two critical vulnerabilities in popular WordPress plugins: CVE-2026-0740 (CVSS 9.8) in the Ninja Forms – File Upload plugin and CVE-2026-3844 (CVSS 9.8) in the Breeze Cache plugin. Both allow unauthenticated attackers to upload arbitrary files to vulnerable WordPress sites. Successful exploitation can lead to remote code execution, enabling complete compromise of the affected websites, data theft, or use of the site as a platform for further attacks.

Separately, researchers documented a prolonged global campaign by unidentified actors targeting internet-exposed Modbus-enabled Programmable Logic Controllers (PLCs). This campaign, active from September to November 2025, scanned and attempted to interact with 14,426 distinct IP addresses across 70 countries. The highest concentrations of targeted IPs were found in the United States, France, Japan, Canada, and India. Analysis by Cato Networks researchers indicated that the activity combined broad, automated scanning for initial access with more focused, sequential probing suggestive of deliberate device fingerprinting. The latter behavior implies efforts to understand specific PLC models and configurations, potentially laying the groundwork for disruption, manipulation, or sabotage of industrial control systems reachable from the public internet. Notably, a subset of the malicious traffic in this campaign was traced to IP addresses geolocated to China, though the researchers emphasized that the use of low-reputation or rotating scanning infrastructure complicates definitive attribution.

Conclusion: Urgency in AI Infrastructure Security
The sequence of events surrounding CVE-2026-33626, from public disclosure to active exploitation within hours, including probes for cloud credentials, internal service enumeration, and loopback port scanning, highlights a critical inflection point in cybersecurity defense. It demonstrates that the window between vulnerability awareness and effective protection is narrowing to mere hours, particularly within the fast-evolving and high-value domain of generative AI infrastructure. Organizations deploying tools like LMDeploy must prioritize not only rapid patching but also overlapping defense-in-depth measures: strict network segmentation (including blocking unnecessary access to metadata services like IMDS), rigorous validation of user-supplied URLs at the application layer, continuous monitoring for anomalous internal connection attempts, and threat intelligence feeds that provide near-real-time alerts on newly disclosed exploits. In the age of AI, vulnerability details are no longer just technical reports; they are actionable intelligence for adversaries, demanding equally swift and sophisticated defensive responses. The exploitation of LMDeploy, alongside threats to WordPress and critical industrial systems, reinforces that comprehensive vulnerability management must span the entire digital ecosystem, from cloud-native AI tools to legacy web plugins and operational technology.
