Version: 1.0-draft
Last Updated: 2026-02-04
Methodology: MITRE ATLAS + Data Flow Diagrams
Framework: MITRE ATLAS (Adversarial Threat Landscape for AI Systems)
This threat model is built on MITRE ATLAS , the industry-standard framework for documenting adversarial threats to AI/ML systems. ATLAS is maintained by MITRE in collaboration with the AI security community.
Key ATLAS Resources:
This is a living document maintained by the OpenClaw community. See CONTRIBUTING-THREAT-MODEL.md for guidelines on contributing:
Reporting new threats
Updating existing threats
Proposing attack chains
Suggesting mitigations
This threat model documents adversarial threats to the OpenClaw AI agent platform and ClawHub skill marketplace, using the MITRE ATLAS framework designed specifically for AI/ML systems.
Component Included Notes OpenClaw Agent Runtime Yes Core agent execution, tool calls, sessions Gateway Yes Authentication, routing, channel integration Channel Integrations Yes WhatsApp, Telegram, Discord, Signal, Slack, etc. ClawHub Marketplace Yes Skill publishing, moderation, distribution MCP Servers Yes External tool providers User Devices Partial Mobile apps, desktop clients
Nothing is explicitly out of scope for this threat model.
┌─────────────────────────────────────────────────────────────────┐
│ UNTRUSTED ZONE │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ WhatsApp │ │ Telegram │ │ Discord │ ... │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ │ │ │ │
└─────────┼────────────────┼────────────────┼──────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 1: Channel Access │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ GATEWAY │ │
│ │ • Device Pairing (1h DM / 5m node grace period) │ │
│ │ • AllowFrom / AllowList validation │ │
│ │ • Token/Password/Tailscale auth │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 2: Session Isolation │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ AGENT SESSIONS │ │
│ │ • Session key = agent:channel:peer │ │
│ │ • Tool policies per agent │ │
│ │ • Transcript logging │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 3: Tool Execution │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ EXECUTION SANDBOX │ │
│ │ • Docker sandbox OR Host (exec-approvals) │ │
│ │ • Node remote execution │ │
│ │ • SSRF protection (DNS pinning + IP blocking) │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 4: External Content │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ FETCHED URLs / EMAILS / WEBHOOKS │ │
│ │ • External content wrapping (XML tags) │ │
│ │ • Security notice injection │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ TRUST BOUNDARY 5: Supply Chain │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ CLAWHUB │ │
│ │ • Skill publishing (semver, SKILL.md required) │ │
│ │ • Pattern-based moderation flags │ │
│ │ • VirusTotal scanning (coming soon) │ │
│ │ • GitHub account age verification │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Flow Source Destination Data Protection F1 Channel Gateway User messages TLS, AllowFrom F2 Gateway Agent Routed messages Session isolation F3 Agent Tools Tool invocations Policy enforcement F4 Agent External web_fetch requests SSRF blocking F5 ClawHub Agent Skill code Moderation, scanning F6 Agent Channel Responses Output filtering
Attribute Value ATLAS ID AML.T0006 - Active Scanning Description Attacker scans for exposed OpenClaw gateway endpoints Attack Vector Network scanning, shodan queries, DNS enumeration Affected Components Gateway, exposed API endpoints Current Mitigations Tailscale auth option, bind to loopback by default Residual Risk Medium - Public gateways discoverable Recommendations Document secure deployment, add rate limiting on discovery endpoints
Attribute Value ATLAS ID AML.T0006 - Active Scanning Description Attacker probes messaging channels to identify AI-managed accounts Attack Vector Sending test messages, observing response patterns Affected Components All channel integrations Current Mitigations None specific Residual Risk Low - Limited value from discovery alone Recommendations Consider response timing randomization
Attribute Value ATLAS ID AML.T0040 - AI Model Inference API Access Description Attacker intercepts pairing code during pairing grace period (1h for DM channel pairing, 5m for node pairing) Attack Vector Shoulder surfing, network sniffing, social engineering Affected Components Device pairing system Current Mitigations 1h expiry (DM pairing) / 5m expiry (node pairing), codes sent via existing channel Residual Risk Medium - Grace period exploitable Recommendations Reduce grace period, add confirmation step
Attribute Value ATLAS ID AML.T0040 - AI Model Inference API Access Description Attacker spoofs allowed sender identity in channel Attack Vector Depends on channel - phone number spoofing, username impersonation Affected Components AllowFrom validation per channel Current Mitigations Channel-specific identity verification Residual Risk Medium - Some channels vulnerable to spoofing Recommendations Document channel-specific risks, add cryptographic verification where possible
Attribute Value ATLAS ID AML.T0040 - AI Model Inference API Access Description Attacker steals authentication tokens from config files Attack Vector Malware, unauthorized device access, config backup exposure Affected Components ~/.openclaw/credentials/, config storage Current Mitigations File permissions Residual Risk High - Tokens stored in plaintext Recommendations Implement token encryption at rest, add token rotation
Attribute Value ATLAS ID AML.T0051.000 - LLM Prompt Injection: Direct Description Attacker sends crafted prompts to manipulate agent behavior Attack Vector Channel messages containing adversarial instructions Affected Components Agent LLM, all input surfaces Current Mitigations Pattern detection, external content wrapping Residual Risk Critical - Detection only, no blocking; sophisticated attacks bypass Recommendations Implement multi-layer defense, output validation, user confirmation for sensitive actions
Attribute Value ATLAS ID AML.T0051.001 - LLM Prompt Injection: Indirect Description Attacker embeds malicious instructions in fetched content Attack Vector Malicious URLs, poisoned emails, compromised webhooks Affected Components web_fetch, email ingestion, external data sources Current Mitigations Content wrapping with XML tags and security notice Residual Risk High - LLM may ignore wrapper instructions Recommendations Implement content sanitization, separate execution contexts
Attribute Value ATLAS ID AML.T0051.000 - LLM Prompt Injection: Direct Description Attacker manipulates tool arguments through prompt injection Attack Vector Crafted prompts that influence tool parameter values Affected Components All tool invocations Current Mitigations Exec approvals for dangerous commands Residual Risk High - Relies on user judgment Recommendations Implement argument validation, parameterized tool calls
Attribute Value ATLAS ID AML.T0043 - Craft Adversarial Data Description Attacker crafts commands that bypass approval allowlist Attack Vector Command obfuscation, alias exploitation, path manipulation Affected Components exec-approvals.ts, command allowlist Current Mitigations Allowlist + ask mode Residual Risk High - No command sanitization Recommendations Implement command normalization, expand blocklist
Attribute Value ATLAS ID AML.T0010.001 - Supply Chain Compromise: AI Software Description Attacker publishes malicious skill to ClawHub Attack Vector Create account, publish skill with hidden malicious code Affected Components ClawHub, skill loading, agent execution Current Mitigations GitHub account age verification, pattern-based moderation flags Residual Risk Critical - No sandboxing, limited review Recommendations VirusTotal integration (in progress), skill sandboxing, community review
Attribute Value ATLAS ID AML.T0010.001 - Supply Chain Compromise: AI Software Description Attacker compromises popular skill and pushes malicious update Attack Vector Account compromise, social engineering of skill owner Affected Components ClawHub versioning, auto-update flows Current Mitigations Version fingerprinting Residual Risk High - Auto-updates may pull malicious versions Recommendations Implement update signing, rollback capability, version pinning
Attribute Value ATLAS ID AML.T0010.002 - Supply Chain Compromise: Data Description Attacker modifies agent configuration to persist access Attack Vector Config file modification, settings injection Affected Components Agent config, tool policies Current Mitigations File permissions Residual Risk Medium - Requires local access Recommendations Config integrity verification, audit logging for config changes
Attribute Value ATLAS ID AML.T0043 - Craft Adversarial Data Description Attacker crafts skill content to evade moderation patterns Attack Vector Unicode homoglyphs, encoding tricks, dynamic loading Affected Components ClawHub moderation.ts Current Mitigations Pattern-based FLAG_RULES Residual Risk High - Simple regex easily bypassed Recommendations Add behavioral analysis (VirusTotal Code Insight), AST-based detection
Attribute Value ATLAS ID AML.T0043 - Craft Adversarial Data Description Attacker crafts content that escapes XML wrapper context Attack Vector Tag manipulation, context confusion, instruction override Affected Components External content wrapping Current Mitigations XML tags + security notice Residual Risk Medium - Novel escapes discovered regularly Recommendations Multiple wrapper layers, output-side validation
Attribute Value ATLAS ID AML.T0040 - AI Model Inference API Access Description Attacker enumerates available tools through prompting Attack Vector "What tools do you have?" style queries Affected Components Agent tool registry Current Mitigations None specific Residual Risk Low - Tools generally documented Recommendations Consider tool visibility controls
Attribute Value ATLAS ID AML.T0040 - AI Model Inference API Access Description Attacker extracts sensitive data from session context Attack Vector "What did we discuss?" queries, context probing Affected Components Session transcripts, context window Current Mitigations Session isolation per sender Residual Risk Medium - Within-session data accessible Recommendations Implement sensitive data redaction in context
Attribute Value ATLAS ID AML.T0009 - Collection Description Attacker exfiltrates data by instructing agent to send to external URL Attack Vector Prompt injection causing agent to POST data to attacker server Affected Components web_fetch tool Current Mitigations SSRF blocking for internal networks Residual Risk High - External URLs permitted Recommendations Implement URL allowlisting, data classification awareness
Attribute Value ATLAS ID AML.T0009 - Collection Description Attacker causes agent to send messages containing sensitive data Attack Vector Prompt injection causing agent to message attacker Affected Components Message tool, channel integrations Current Mitigations Outbound messaging gating Residual Risk Medium - Gating may be bypassed Recommendations Require explicit confirmation for new recipients
Attribute Value ATLAS ID AML.T0009 - Collection Description Malicious skill harvests credentials from agent context Attack Vector Skill code reads environment variables, config files Affected Components Skill execution environment Current Mitigations None specific to skills Residual Risk Critical - Skills run with agent privileges Recommendations Skill sandboxing, credential isolation
Attribute Value ATLAS ID AML.T0031 - Erode AI Model Integrity Description Attacker executes arbitrary commands on user system Attack Vector Prompt injection combined with exec approval bypass Affected Components Bash tool, command execution Current Mitigations Exec approvals, Docker sandbox option Residual Risk Critical - Host execution without sandbox Recommendations Default to sandbox, improve approval UX
Attribute Value ATLAS ID AML.T0031 - Erode AI Model Integrity Description Attacker exhausts API credits or compute resources Attack Vector Automated message flooding, expensive tool calls Affected Components Gateway, agent sessions, API provider Current Mitigations None Residual Risk High - No rate limiting Recommendations Implement per-sender rate limits, cost budgets
Attribute Value ATLAS ID AML.T0031 - Erode AI Model Integrity Description Attacker causes agent to send harmful/offensive content Attack Vector Prompt injection causing inappropriate responses Affected Components Output generation, channel messaging Current Mitigations LLM provider content policies Residual Risk Medium - Provider filters imperfect Recommendations Output filtering layer, user controls
Control Implementation Effectiveness GitHub Account Age requireGitHubAccountAge()Medium - Raises bar for new attackers Path Sanitization sanitizePath()High - Prevents path traversal File Type Validation isTextFile()Medium - Only text files, but can still be malicious Size Limits 50MB total bundle High - Prevents resource exhaustion Required SKILL.md Mandatory readme Low security value - Informational only Pattern Moderation FLAG_RULES in moderation.ts Low - Easily bypassed Moderation Status moderationStatus fieldMedium - Manual review possible
Current patterns in moderation.ts:
// Known-bad identifiers
/(keepcold131\/ClawdAuthenticatorTool|ClawdAuthenticatorTool)/i
// Suspicious keywords
/(malware|stealer|phish|phishing|keylogger)/i
/(api[-_ ]?key|token|password|private key|secret)/i
/(wallet|seed phrase|mnemonic|crypto)/i
/(discord\.gg|webhook|hooks\.slack)/i
/(curl[^\n]+\|\s*(sh|bash))/i
/(bit\.ly|tinyurl\.com|t\.co|goo\.gl|is\.gd)/i
Limitations:
Only checks slug, displayName, summary, frontmatter, metadata, file paths
Does not analyze actual skill code content
Simple regex easily bypassed with obfuscation
No behavioral analysis
Improvement Status Impact VirusTotal Integration In Progress High - Code Insight behavioral analysis Community Reporting Partial (skillReports table exists) Medium Audit Logging Partial (auditLogs table exists) Medium Badge System Implemented Medium - highlighted, official, deprecated, redactionApproved
Threat ID Likelihood Impact Risk Level Priority T-EXEC-001 High Critical Critical P0 T-PERSIST-001 High Critical Critical P0 T-EXFIL-003 Medium Critical Critical P0 T-IMPACT-001 Medium Critical High P1 T-EXEC-002 High High High P1 T-EXEC-004 Medium High High P1 T-ACCESS-003 Medium High High P1 T-EXFIL-001 Medium High High P1 T-IMPACT-002 High Medium High P1 T-EVADE-001 High Medium Medium P2 T-ACCESS-001 Low High Medium P2 T-ACCESS-002 Low High Medium P2 T-PERSIST-002 Low High Medium P2
Attack Chain 1: Skill-Based Data Theft
T-PERSIST-001 → T-EVADE-001 → T-EXFIL-003
(Publish malicious skill) → (Evade moderation) → (Harvest credentials)
Attack Chain 2: Prompt Injection to RCE
T-EXEC-001 → T-EXEC-004 → T-IMPACT-001
(Inject prompt) → (Bypass exec approval) → (Execute commands)
Attack Chain 3: Indirect Injection via Fetched Content
T-EXEC-002 → T-EXFIL-001 → External exfiltration
(Poison URL content) → (Agent fetches & follows instructions) → (Data sent to attacker)
ID Recommendation Addresses R-001 Complete VirusTotal integration T-PERSIST-001, T-EVADE-001 R-002 Implement skill sandboxing T-PERSIST-001, T-EXFIL-003 R-003 Add output validation for sensitive actions T-EXEC-001, T-EXEC-002
ID Recommendation Addresses R-004 Implement rate limiting T-IMPACT-002 R-005 Add token encryption at rest T-ACCESS-003 R-006 Improve exec approval UX and validation T-EXEC-004 R-007 Implement URL allowlisting for web_fetch T-EXFIL-001
ID Recommendation Addresses R-008 Add cryptographic channel verification where possible T-ACCESS-002 R-009 Implement config integrity verification T-PERSIST-003 R-010 Add update signing and version pinning T-PERSIST-002
ATLAS ID Technique Name OpenClaw Threats AML.T0006 Active Scanning T-RECON-001, T-RECON-002 AML.T0009 Collection T-EXFIL-001, T-EXFIL-002, T-EXFIL-003 AML.T0010.001 Supply Chain: AI Software T-PERSIST-001, T-PERSIST-002 AML.T0010.002 Supply Chain: Data T-PERSIST-003 AML.T0031 Erode AI Model Integrity T-IMPACT-001, T-IMPACT-002, T-IMPACT-003 AML.T0040 AI Model Inference API Access T-ACCESS-001, T-ACCESS-002, T-ACCESS-003, T-DISC-001, T-DISC-002 AML.T0043 Craft Adversarial Data T-EXEC-004, T-EVADE-001, T-EVADE-002 AML.T0051.000 LLM Prompt Injection: Direct T-EXEC-001, T-EXEC-003 AML.T0051.001 LLM Prompt Injection: Indirect T-EXEC-002
Path Purpose Risk Level src/infra/exec-approvals.tsCommand approval logic Critical src/gateway/auth.tsGateway authentication Critical src/infra/net/ssrf.tsSSRF protection Critical src/security/external-content.tsPrompt injection mitigation Critical src/agents/sandbox/tool-policy.tsTool policy enforcement Critical src/routing/resolve-route.tsSession isolation Medium
Term Definition ATLAS MITRE's Adversarial Threat Landscape for AI Systems ClawHub OpenClaw's skill marketplace Gateway OpenClaw's message routing and authentication layer MCP Model Context Protocol - tool provider interface Prompt Injection Attack where malicious instructions are embedded in input Skill Downloadable extension for OpenClaw agents SSRF Server-Side Request Forgery
This threat model is a living document. Report security issues to [email protected]