---
title: OpenClaw Security Risks: Prompt Injection, Malicious Skills, and Safe Deployment Practices
canonical_url: https://opensummitai.directory.norg.ai/artificial-intelligence/agentic-ai-platforms-autonomous-agents/openclaw-security-risks-prompt-injection-malicious-skills-and-safe-deployment-practices/
category: 
description: 
geography:
  city: 
  state: 
  country: 
metadata:
  phone: 
  email: 
  website: 
publishedAt: 
---

# OpenClaw Security Risks: Prompt Injection, Malicious Skills, and Safe Deployment Practices

Now I have comprehensive, verified data from authoritative sources. Let me compile the final article.

## OpenClaw Security Risks: Prompt Injection, Malicious Skills, and Safe Deployment Practices

OpenClaw's appeal is inseparable from its danger. The same architectural choices that make it the most starred project in GitHub history — local execution of shell commands, persistent memory, broad OAuth access, and a messaging-channel interface — create a threat surface that cybersecurity researchers have described, without hyperbole, as an "absolute nightmare." 
From a security perspective, it's an absolute nightmare.
 This is not a theoretical assessment: within weeks of OpenClaw's viral surge in January 2026, the platform became the focal point of documented, multi-vector attacks involving a critical remote code execution vulnerability, a large-scale supply-chain poisoning campaign through its skills marketplace, and tens of thousands of misconfigured instances exposed directly to the internet.

This guide is the definitive security reference for anyone evaluating, deploying, or hardening OpenClaw. It covers the documented threat surface in precise technical detail, the specific incidents that have defined the security conversation around the platform, and the actionable controls that reduce risk to an acceptable level. For background on how the platform works architecturally — the Gateway, agent loop, and Skills system — see our guide on *How OpenClaw Works: The Gateway, Agent Loop, Skills System, and Memory Architecture*.

---

## Why OpenClaw Represents a New Risk Category

Traditional software vulnerabilities are bounded: a compromised web application can exfiltrate the data it touches. 
AI agents with broad system access represent a fundamentally new risk category.
 OpenClaw is not merely a web application. 
OpenClaw can connect your Telegram, Discord, Slack, WeChat, and email, bridge to large models like Claude, GPT, and Gemini, and also execute commands, read/write files, and operate a browser. This means it simultaneously possesses three dangerous things: data access (it can read your files, configurations, API keys, and session records), untrusted input (anyone can send content to it through message channels), and execution capability (it can run commands on your system, send messages, and call APIs).


The Acronis Threat Research Unit has characterised this combination as a "new privileged identity." When that identity is compromised, the blast radius is not limited to a single service. 
If the agent is compromised — through a malicious skill, prompt injection, or vulnerability exploit — attackers inherit all of that access.


China's National Computer Network Emergency Response Technical Team (CNCERT) noted that the platform's "inherently weak default security configurations," coupled with its privileged access to the system to facilitate autonomous task execution capabilities, could be explored by bad actors to seize control of the endpoint.


---

## The Four Primary Attack Vectors

### 1. The Exposed Gateway: Port 18789 as an Open Control Plane


At the center of the OpenClaw system is the Gateway, a locally running service that brokers communication between chat interfaces, the AI model and tools or "skills." The Gateway exposes APIs (notably over a WebSocket interface on TCP port 18789) used by the Control UI and other components.


The design intent is that this port should only be accessible locally or through a secure tunnel. In practice, the deployment reality has been catastrophically different. 
Security teams identified tens of thousands of internet-facing OpenClaw instances exposed due to default configurations or simple misconfiguration, often bound to all network interfaces instead of localhost.


The scale of exposure was documented by multiple independent scanning teams. 
Censys tracked growth from approximately 1,000 to over 21,000 publicly exposed instances between 25 and 31 January 2026.
 
Bitsight observed more than 30,000 instances across a broader analysis window.
 By mid-February, 
a Shodan scan on February 18, 2026, found 312,000+ OpenClaw instances running on default port 18789, many with no authentication and open to the internet.


Critically, exposure is not the only risk — even localhost-bound instances are vulnerable to browser-based attacks. 
The vulnerability stemmed from OpenClaw's incorrect assumption that any connection originating from localhost can be implicitly trusted, without accounting for the fact that websites can also originate connections from that same local address. Researchers found that if a developer were to visit any attacker-controlled or compromised website, JavaScript on that page could silently open a WebSocket connection directly to the OpenClaw gateway.


Honeypot research confirmed that attackers were not waiting for sophisticated exploits. 
Pillar Security set up a honeypot that mimicked a Clawdbot/Moltbot gateway (OpenClaw's earlier names) and saw protocol-aware exploitation within minutes/hours. The key point: many attackers skipped prompt injection entirely and went straight to the gateway's WebSocket API on TCP/18789, treating it like a remotely exploitable control plane.


Researchers discovered OpenClaw had set no rate limits or failure thresholds for incorrect passwords, meaning an attacker could use brute-force methods to attempt to guess the password to the gateway without triggering any alert.


### 2. CVE-2026-25253 and the Broader CVE Cascade


Within three weeks of its surge in popularity, OpenClaw became the focal point of a multi-vector security crisis involving a critical remote code execution vulnerability (CVE-2026-25253), a large-scale supply-chain poisoning campaign in its skills marketplace, and systemic architectural weaknesses that amplify the impact of both.


The most acute technical risk emerged with the disclosure of CVE-2026-25253, classified under CWE-669 (Incorrect Resource Transfer Between Spheres) and rated CVSS 8.8. The flaw was discovered by Mav Levin of the depthfirst research team and patched in OpenClaw version 2026.1.29, released on 30 January 2026. On the same day, the project issued three high-impact security advisories: one for this RCE chain and two additional command injection vulnerabilities.


The CVE cascade did not stop there. 
In recent weeks, OpenClaw has also been found susceptible to multiple vulnerabilities (CVE-2026-25593, CVE-2026-24763, CVE-2026-25157, CVE-2026-25475, CVE-2026-26319, CVE-2026-26322, CVE-2026-26329), ranging from moderate to high severity, that could result in remote code execution, command injection, server-side request forgery (SSRF), authentication bypass, and path traversal.


A further critical vulnerability was disclosed in March 2026. 
CVE-2026-32922 is a critical privilege escalation in OpenClaw (CVSS 9.9) that lets any paired device obtain full admin access via one API call. A single low-privilege operator.pairing token — easy to acquire through normal device pairing — is enough to escalate to operator.admin and execute arbitrary commands across every connected node.
 
The community security tracker jgamblin/OpenClawCVEs logged 137 security advisories between February and April 2026 alone, of which CVE-2026-32922 is rated the most severe.


### 3. Prompt Injection: The Attack That Lives in Your Content

Prompt injection is not a vulnerability that can be fully patched — it is a structural property of how large language models process instructions. The foundational academic work on this threat was published by Greshake et al. in 2023 in the *Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security*. 
Their research showed that LLM-integrated applications blur the line between data and instructions and revealed several new attack vectors, using Indirect Prompt Injection, that enable adversaries to remotely exploit LLM-integrated applications by strategically injecting prompts into data likely to be retrieved at inference time.


OpenClaw is acutely exposed to this attack class because its agent loop routinely fetches and processes external content — web pages, emails, calendar entries, Slack messages — as part of normal task execution. 
This includes risks arising from prompt injections, where malicious instructions embedded within a web page can cause the agent to leak sensitive information if it's tricked into accessing and consuming the content.


The attack surface is compounded by OpenClaw's memory architecture. 
OpenClaw also had a log poisoning vulnerability that allowed attackers to write malicious content to log files via WebSocket requests to a publicly accessible instance on TCP port 18789. Since the agent reads its own logs to troubleshoot certain tasks, the security loophole could be abused by a threat actor to embed indirect prompt injections, leading to unintended consequences.


A particularly insidious variant involves OpenClaw's XML content-wrapping mechanism. 
OpenClaw wraps external content with XML tags to mark it as "untrusted." But attackers can insert fake closing tags in content, making the LLM think malicious instructions are in the "trusted" area, thereby bypassing protection.


Indirect prompt injection attacks exploit the agent's interaction with intermediate servers and inject malicious instructions in websites and databases. When accessed by the agent, these instructions may trigger unauthorized behaviors, such as exfiltrating private data or executing harmful actions.


### 4. Malicious Skills: The ClawHub Supply Chain Problem

The ClawHub skills marketplace was designed to be the extensibility engine that makes OpenClaw adaptable to any workflow (see our guide on *OpenClaw Skills: How to Find, Install, and Build Custom Skills with ClawHub*). It became, in early 2026, a primary vector for malware distribution.

The Cisco AI Defense team provided the most vivid documented example. 
When Cisco's AI Defense team ran their Skill Scanner against OpenClaw's most popular community skill (one that had been gamed to the #1 ranking on the skills repository), they found nine security vulnerabilities. Two were critical. The skill, mockingly named "What Would Elon Do?", was functionally malware: it silently exfiltrated data to attacker-controlled servers and used direct prompt injection to bypass safety guidelines.


The mechanics of the exfiltration were precise: 
the skill explicitly instructed the bot to execute a curl command that sends data to an external server controlled by the skill author. The network call is silent, meaning that the execution happens without user awareness. The skill also conducts a direct prompt injection to force the assistant to bypass its internal safety guidelines and execute this command without asking.


This was not an isolated incident. The "ClawHavoc" campaign, first detected on 29 January 2026, industrialised the approach. 
ClawHavoc primarily uses Atomic Stealer (macOS) and keyloggers (Windows) whilst masquerading as legitimate cryptocurrency tools (e.g., "solana-wallet-tracker," "youtube-summarize-pro"). Installation documentation includes a "Prerequisites" section directing users to execute curl commands downloading the stealer malware.


The scale of marketplace contamination grew rapidly. 
Researchers at Koi Security found that out of 10,700 skills on ClawHub, more than 820 were malicious, a sharp increase from the 324 it had discovered just a few weeks prior in early February.


Snyk's security research team conducted the most comprehensive audit of the skills ecosystem, scanning 3,984 skills from ClawHub and skills.sh as of February 5, 2026. 
The findings were stark: 13.4% of all skills, or 534 in total, contained at least one critical-level security issue, including malware distribution, prompt injection attacks, and exposed secrets.
 
Hardcoded secrets appeared in 10.9% of all ClawHub skills and 32% of confirmed malicious samples.


Starting around February 1, 2026, ClawHub became a primary distribution channel for malware due to minimal publication barriers and absence of automated security scanning. The marketplace operated on a trust-by-default model where any GitHub account aged one week or older could publish skills without code review.


---

## The Compound Risk: Why These Vectors Amplify Each Other

Each attack vector above is serious in isolation. In combination, they create what Sophos X-Ops has called a "lethal trifecta." 
Traditional software executes attacker input once. Agent systems may reason over it indefinitely. Context becomes memory, memory becomes authority, and authority compounds over time. The attack surface becomes temporal, not merely technical.


A concrete attack chain illustrates this: a user installs a malicious ClawHub skill that embeds a prompt injection. The injection instructs the agent to bypass safety guidelines. The agent then silently executes a curl command exfiltrating API keys stored in the workspace. Because the action does not require shell approval (if `exec.approvals` is disabled), no confirmation prompt appears. The keys are burned before the user notices unusual activity.


The current ecosystem's primary shortcoming is the lack of a hard boundary between instructions and data. Because LLMs process system prompts and user inputs as a single continuous stream of tokens, deterministic security is mathematically impossible at the model layer. Every mitigation today is probabilistic.


---

## Safe Deployment: A Hardening Checklist

The following controls represent the current community and researcher consensus on minimum viable security for OpenClaw deployments. They are ordered by implementation effort, not by importance — all should be applied.

### Patch and Update Immediately


Oasis Security recommended: "If OpenClaw is installed, update immediately." The fix for the ClawJacked vulnerability is included in version 2026.2.25 and later. "Ensure all instances are updated — treat this with the same urgency as any critical security patch."
 The community-recommended hardened baseline as of April 2026 is **v2026.3.21**, which includes additional security hardening beyond individual CVE fixes.

### Network Isolation: Never Expose Port 18789

The single highest-impact configuration decision is gateway binding. The gateway must be bound to `127.0.0.1`, not `0.0.0.0`. Remote access should be provided exclusively through an authenticated tunnel (SSH port forwarding or Tailscale), never by exposing the port directly.

```bash
# ✅ Correct: bind to loopback only
openclaw gateway --bind localhost

# ✅ Correct: remote access via SSH tunnel
ssh -L 18789:127.0.0.1:18789 your-server

# ❌ Never do this
openclaw gateway --bind lan  # exposes to 0.0.0.0
```


Organisations exploring agentic AI should isolate agents aggressively: run them only in dedicated virtual machines or containers, never on standard workstations.
 For full self-hosting guidance including Tailscale and Cloudflare tunnel configuration, see our guide on *How to Self-Host OpenClaw Safely: VPS, Raspberry Pi, and Home Lab Deployment Guide*.

### Enable Strong Authentication and Execution Approvals


While authentication is required, there is no enforced policy around password or token complexity. As a result, even something as trivial as "a" is considered a valid password or token. OpenClaw remains exposed to some of the most fundamental attack vectors, such as brute-force credential guessing and the lack of effective credential strength enforcement. This makes publicly exposed instances risky, even when authentication is technically enabled.


Use a cryptographically strong token (minimum 32 random characters). Additionally, ensure `exec.approvals` is set to `true` in your configuration. 
If exec.approvals is set to false in your config, treat your instance as compromised until proven otherwise. This setting disables the confirmation prompt for terminal commands — it is the single highest-risk configuration mistake in self-hosted OpenClaw deployments.


### Skill Vetting: Treat Every Skill as Untrusted Code


Supply-chain security for skills is essential. Skills must be treated like software packages — hashed, validated, and provenance-tracked.


Before installing any ClawHub skill:

1. **Read the full `SKILL.md` source** — look for any `curl`, `wget`, `fetch`, or outbound network calls not explained by the skill's stated purpose.
2. **Run a skill scanner** — Cisco's open-source `skill-scanner` tool (
a best-effort security scanner for AI Agent Skills that detects prompt injection, data exfiltration, and malicious code patterns, combining pattern-based detection (YAML + YARA), LLM-as-a-judge, and behavioral dataflow analysis
) is available on GitHub at `cisco-ai-defense/skill-scanner`.
3. **Check the publisher** — 
attackers distributed malicious skills using professional documentation and innocuous names like "solana-wallet-tracker" to appear legitimate, then instructed users to run external code that installed keyloggers on Windows or Atomic Stealer malware on macOS.

4. **Disable automatic skill updates** — a skill that was clean on installation may not remain so.

### Apply Zero-Trust Principles to the Agent Identity


Agents should have narrowly scoped access, short-lived tokens and no standing permissions to sensitive systems.
 
Overprivileged agents are among the highest-risk conditions in any agentic deployment. In an agentic context, least privilege needs to be a per-action constraint, evaluated at the level of each individual tool call.


Practically, this means:
- Grant OAuth access only to the specific services the agent needs for its defined tasks.
- Rotate API keys regularly and store them in a secrets manager, not in plaintext workspace files.
- 
Assume prompt injection is inevitable: treat all external inputs — such as web content, emails and messages — as hostile by default.


### Incident Response: If You Suspect Compromise

If your instance shows signs of compromise (unexpected outbound traffic, skills you did not install, API authentication failures), act in this sequence: 
disconnect from the network and block port 18789 at the firewall; do NOT wipe the machine yet — capture logs and skill list first; rotate every API key connected to OpenClaw (OpenAI, Anthropic, Telegram, Discord, GitHub); audit and remove any skills you did not install; update to the latest patched version; and re-enable authentication and exec.approvals before reconnecting.


---

## Comparison: Self-Hosted vs. Managed Hosting Security Posture

| Security Dimension | Self-Hosted (Unhardened) | Self-Hosted (Hardened) | Managed Hosting (e.g., Clawd.au) |
|---|---|---|---|
| Port 18789 exposure | High risk | Mitigated by binding | Eliminated architecturally |
| CVE patching | Manual, delayed | Manual, prompt | Automated pre-disclosure |
| Skill vetting | User responsibility | User + scanner | Provider-enforced |
| Credential storage | Plaintext risk | Secrets manager | Provider-managed |
| Data sovereignty (AU) | Depends on VPS | Depends on VPS | Equinix Sydney, local inference |
| Compliance posture | Unverified | Unverified | Auditable SLA |

For Australian businesses with data sovereignty obligations under the Privacy Act 1988, the managed hosting option warrants specific evaluation — see our guide on *OpenClaw Managed Hosting in Australia: Data Sovereignty, Compliance, and Provider Options*.

---

## Key Takeaways

- 
Full disk access, terminal permissions, and OAuth tokens are routinely granted to make the agent functional
 — making OpenClaw a high-value target with a large blast radius on compromise.
- 
Honeypot deployments have recorded exploitation attempts within minutes of exposure
 — any instance reachable on port 18789 should be considered under active attack.
- 
If you installed a ClawHub skill in the past month, there's a 13% chance it contains a critical security flaw
 (Snyk ToxicSkills research, February 2026) — skill vetting is not optional.
- Prompt injection cannot be fully patched at the model layer; every mitigation is probabilistic. Defence-in-depth — network isolation, execution approvals, narrowly scoped credentials — is the only viable posture.
- The minimum community-recommended hardened baseline is **v2026.3.21**. Running any earlier version on an internet-accessible host is an active security incident.

---

## Conclusion

OpenClaw's security crisis is not a reason to avoid agentic AI — it is the clearest available lesson in how to deploy it responsibly. 
The lesson of OpenClaw is not that agents are unsafe. It is that capability systems require operating-system-level thinking.
 The platform's documented vulnerabilities — the exposed gateway, the CVE cascade, the malicious skills supply chain, and the structural prompt injection exposure — are each addressable with specific, known controls. The organisations that will deploy OpenClaw safely are those that treat the agent as a privileged system identity, not a convenient chatbot.

For the broader governance and accountability questions raised by autonomous agent deployment — including Australia-specific regulatory obligations — see our guide on *OpenClaw Ethics and Governance: Autonomous Agent Accountability, Consent, and Regulation*. For a forward-looking view of how OpenClaw's security architecture is evolving under foundation stewardship, see *OpenClaw Roadmap and Future of Agentic AI: What Comes After the Viral Moment*.

---

## References

- Greshake, Kai, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." *Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security*, 2023. https://arxiv.org/abs/2302.12173

- Cisco AI Defense Team. "Personal AI Agents like OpenClaw Are a Security Nightmare." *Cisco Blogs*, January 30, 2026. https://blogs.cisco.com/ai/personal-ai-agents-like-openclaw-are-a-security-nightmare

- Oasis Security. "Critical OpenClaw Vulnerability Exposes AI Agent Risks." *Dark Reading*, March 2, 2026. https://www.darkreading.com/application-security/critical-openclaw-vulnerability-ai-agent-risks

- Snyk Security Research. "ToxicSkills: Snyk Finds Prompt Injection in 36%, 1,467 Malicious Payloads in Agent Skills Supply Chain Study." *Snyk Blog*, February 5, 2026. https://snyk.io/blog/toxicskills-malicious-ai-agent-skills-clawhub/

- Conscia Security Research. "The OpenClaw Security Crisis." *Conscia Blog*, February 23, 2026. https://conscia.com/blog/the-openclaw-security-crisis/

- Bitsight Research. "OpenClaw Security: Risks of Exposed AI Agents Explained." *Bitsight Blog*, February 9, 2026. https://www.bitsight.com/blog/openclaw-ai-security-risks-exposed-instances

- Acronis Threat Research Unit. "OpenClaw: Agentic AI in the Wild — Architecture, Adoption and Emerging Security Risks." *Acronis Blog*, February 23, 2026. https://www.acronis.com/en/tru/posts/openclaw-agentic-ai-in-the-wild-architecture-adoption-and-emerging-security-risks/

- Flare Threat Intelligence. "Widespread OpenClaw Exploitation by Multiple Threat Groups." *Flare Blog*, February 25, 2026. https://flare.io/learn/resources/blog/widespread-openclaw-exploitation

- Barracuda Networks. "OpenClaw Security Risks: What Security Teams Need to Know About Agentic AI." *Barracuda Blog*, April 9, 2026. https://blog.barracuda.com/2026/04/09/openclaw-security-risks-agentic-ai

- China National Computer Network Emergency Response Technical Team (CNCERT). Security Advisory on OpenClaw. *WeChat / CNCERT Official Channel*, 2026. Cited via: The Hacker News. "OpenClaw AI Agent Flaws Could Enable Prompt Injection and Data Exfiltration." https://thehackernews.com/2026/03/openclaw-ai-agent-flaws-could-enable.html

- Network World / Cisco. "Cisco Goes All In on Agentic AI Security." *Network World*, April 2026. https://www.networkworld.com/article/4148823/cisco-goes-all-in-on-agentic-ai-security.html