
Threat Modeling an Agentic AI System - OpenClaw: A Practical Walkthrough

  • Ken Munson
  • Mar 8
  • 5 min read

Updated: Mar 22

Introduction


Full disclosure: I wrote an introduction paragraph and ran it through ChatGPT - what follows is the combination of my original paragraph and some cleaned-up verbiage from ChatGPT.

Traditional software systems are built on a predictable model: developers write code that processes input and produces output according to predefined logic. Security architecture for these systems focuses on protecting that deterministic behavior from manipulation.


Agentic AI systems introduce a fundamentally different paradigm. Instead of fixed program logic, these systems use large language models to reason about tasks and dynamically decide what actions to take using external tools, APIs, or system commands - often completely autonomously. This shift dramatically increases capability - but it also introduces a new category of risk. When an AI system can translate reasoning directly into real-world actions, the boundary between suggestion and execution becomes a critical security concern. Threat modeling these systems therefore requires new ways of thinking about trust boundaries, attack surfaces, and control points.
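To make that boundary concrete, here is a minimal Python sketch of the moment a model's suggestion becomes execution. The model call is stubbed out, and the function names are mine for illustration - not any framework's actual API:

```python
import subprocess

def llm_propose_action(task: str) -> dict:
    """Stub for the LLM call; a real agent parses the model's text
    output into a structured tool invocation like this one."""
    return {"tool": "exec", "command": "echo workspace listing"}

def run_agent_step(task: str) -> str:
    action = llm_propose_action(task)  # the model only *suggests* an action
    if action["tool"] == "exec":
        # This line is the reasoning-to-action boundary: model output
        # becomes a real system command, with no human review in between.
        result = subprocess.run(action["command"], shell=True,
                                capture_output=True, text=True)
        return result.stdout
    return ""
```

Everything interesting in this threat model happens at that `subprocess.run` call: whatever text the model emits, for whatever reason it emitted it, is executed.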



The OpenClaw Environment


To explore how threat modeling applies to agentic AI systems in practice, I built a small experimental environment using OpenClaw, an autonomous agent runtime capable of reasoning about tasks and executing system tools. The environment is intentionally simple but captures many of the core architectural elements found in modern agent frameworks. The system architecture is described in the next section.



System Architecture




Assets and Attack Surfaces


Assets represent components that must be protected from compromise or misuse. In this lab and threat modeling project, those are as follows:


Infrastructure Assets

  • VPS host operating system

  • Docker runtime

  • Container isolation boundaries

  • Firewall configuration

Application Assets

  • OpenClaw agent runtime

  • OpenClaw gateway interface

  • Execution monitoring sidecar

Data Assets

  • Workspace files

  • Agent memory files

  • Execution logs

  • Governance logs

Credential Assets

  • OpenAI API key

  • SSH Key

  • Environment variables

  • Credential files

Operational Assets

  • Tool execution capability (exec)

  • Monitoring and governance records

  • Container filesystem




Trust Boundaries





STRIDE Threat Enumeration for this OpenClaw Agent Lab


Using the system architecture and trust boundaries identified earlier, threats can be enumerated using the STRIDE framework. The table below highlights representative threat scenarios for key components of the OpenClaw lab environment, along with existing controls and potential mitigations.


| Component / Boundary | STRIDE Category | Threat Scenario | Potential Impact | Existing Controls | Suggested Mitigations |
|---|---|---|---|---|---|
| Gateway Web Interface | Spoofing | Unauthorized user attempts to access the OpenClaw UI | Attacker gains control of agent tasks | UFW restricted to home IP | Add authentication layer or reverse proxy auth |
| SSH Administrative Access | Spoofing | Attacker attempts to impersonate admin via SSH | Host compromise | SSH key auth, firewall IP restriction | Fail2ban, rotate keys periodically |
| Agent Workspace Files | Tampering | Malicious modification of memory/config files | Agent behavior altered or poisoned | Container filesystem isolation | File integrity monitoring |
| Governance Logs | Tampering | Logs altered to hide malicious activity | Loss of forensic evidence | Sidecar watcher logs | Append-only logs or remote logging |
| Agent Tool Execution (exec) | Repudiation | Command execution without clear attribution | Hard to trace origin of actions | Watcher audit logs | Signed logs and correlation IDs |
| Agent Workspace Data | Information Disclosure | Agent exposes sensitive data through prompts or tool output | Credential or data leakage | Container scope | Output filtering, secret management |
| OpenAI API Key | Information Disclosure | API key exposed via logs or environment variables | Unauthorized API usage | Local credential storage | Secret vault / environment isolation |
| Gateway UI | Denial of Service | Flooding of requests or malformed input | Agent unavailable | Firewall restrictions | Rate limiting, reverse proxy |
| External LLM API | Denial of Service | LLM provider unavailable or slow | Agent reasoning fails | None | Fallback models or retry logic |
| Agent Tool Execution Layer | Elevation of Privilege | Prompt injection causes agent to run unsafe commands | System compromise inside container | Container isolation | Tool allowlist, policy guardrails |
| Docker Runtime | Elevation of Privilege | Container escape attempt | Host compromise | Namespace isolation | Seccomp, rootless containers |
| Reasoning → Action Boundary | Tampering / Elevation | Prompt injection manipulates LLM into executing commands | Arbitrary command execution | Container sandbox + watcher logs | Guardrails, approval gates, policy engine |
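As a sketch of what the "tool allowlist, policy guardrails" mitigation might look like in code - the allowlist and blocklist here are illustrative examples I chose, not OpenClaw's actual policy:

```python
import shlex

# Illustrative policy for the exec tool: a binary allowlist plus a
# naive blocklist of dangerous substrings.
ALLOWED_BINARIES = {"ls", "cat", "grep", "echo", "head"}
BLOCKED_SUBSTRINGS = ("rm ", "curl", "wget", "| sh", "&&", ";")

def is_command_allowed(command: str) -> bool:
    """Return True only if the command passes both checks.

    Substring blocking is easy to bypass; real guardrails should also
    parse the command and run it under least privilege."""
    if any(bad in command for bad in BLOCKED_SUBSTRINGS):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes, etc.
        return False
    return bool(argv) and argv[0] in ALLOWED_BINARIES
```

A check like this would sit in front of the `exec` tool, so the agent's proposed command is evaluated before it ever reaches a shell.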




Risk Matrix – OpenClaw Threat Model


And for good measure, let's add risk ratings for some of the components mentioned above. The big takeaway here is the critical nature of the prompt injection threat - this is where most agentic AI security is focused right now, in this project and in the wild. The acronym below is one to become familiar with; it will be as important as (and goes hand in hand with) "HITL": RTAB, the Reasoning-to-Action Boundary. This is by far the most important area to focus on in autonomous, agentic AI systems - so much so that I am going to write a separate blog post just on that topic in the next 1-2 days. There is a lot to say there.


| Threat | Likelihood | Impact | Risk Level | Notes |
|---|---|---|---|---|
| Prompt injection leading to command execution | High | High | Critical | Core RTAB risk in agent systems |
| Tool misuse via exec capability | Medium | High | High | Agent may execute unsafe commands |
| Credential exposure (OpenAI API key) | Medium | Medium | Medium | Possible via logs or workspace |
| Container escape | Low | High | Medium | Docker isolation lowers likelihood |
| Workspace file manipulation | Medium | Medium | Medium | Could alter agent behavior |
| Governance log tampering | Low | Medium | Low | Monitoring already implemented |
| Gateway interface compromise | Low | Medium | Low | Firewall restriction lowers exposure |
| SSH compromise | Low | High | Low | Key-based auth + IP restriction |
| LLM API outage | Medium | Low | Low | Availability issue rather than compromise |
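These qualitative ratings roughly follow a simple likelihood × impact product. The scoring below is my own sketch, not a formal standard - and note that the SSH row above is rated a notch lower than this product would give, reflecting its strong existing controls:

```python
# Map qualitative ratings to scores and combine them multiplicatively.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}

def risk_level(likelihood: str, impact: str) -> str:
    """Illustrative likelihood x impact lookup for this risk matrix."""
    score = LEVELS[likelihood] * LEVELS[impact]
    if score >= 9:
        return "Critical"   # High x High
    if score >= 6:
        return "High"       # e.g. Medium x High
    if score >= 3:
        return "Medium"     # e.g. Low x High, Medium x Medium
    return "Low"
```

For example, prompt injection (High × High) lands at Critical, while governance log tampering (Low × Medium) lands at Low, matching the table.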




OpenClaw Risk Matrix with STRIDE and OWASP Agentic AI Nomenclature Mapping


Okay, really the last thing: here is one more table, tying each threat to the appropriate OWASP Top 10 for Agentic Systems nomenclature.


| Threat | Primary STRIDE | Best-Fit OWASP Agentic AI Mapping | Risk Rating | Notes |
|---|---|---|---|---|
| Prompt injection leading to command execution | Tampering / Elevation of Privilege | ASI01 + ASI05; secondary: ASI02, ASI06 | Critical | Best fit for goal/decision-path manipulation that crosses the Reasoning-to-Action Boundary into execution. Secondary if legitimate tools or persisted context are involved. |
| Tool misuse via exec capability | Elevation of Privilege / Tampering | ASI02; secondary: ASI05, ASI03 | High | Legitimate tool used in an unsafe way. Secondary if misuse becomes code execution or abuses inherited credentials/privileges. |
| Credential exposure (OpenAI API key) | Information Disclosure | ASI03; secondary: ASI02, ASI06 | Medium | Best fit when agent identity, API keys, delegated trust, or cached credentials are exposed or abused. |
| Container escape | Elevation of Privilege | ASI05; secondary: ASI04 | Medium | Best fit when prompt- or tool-driven execution leads to host/container compromise. Secondary if the path involves a malicious dependency, tool, or MCP component. |
| Workspace file manipulation | Tampering | ASI06; secondary: ASI01, ASI05 | Medium | Best fit when files act as durable context or memory that can bias later reasoning, planning, or tool use. |
| Governance log tampering | Tampering / Repudiation | No clean ASI mapping | Low | Keep as a conventional control-plane threat in the model. Important, but not distinctly agentic. |
| Gateway interface compromise | Spoofing / Tampering | No clean ASI mapping; possible secondary: ASI03 | Low | Mostly a standard web/admin attack surface unless identity or delegated trust is central to the compromise. |
| SSH compromise | Spoofing / Elevation of Privilege | No clean ASI mapping; possible secondary: ASI03 | Low | Primarily an infrastructure/admin risk. Retain it in the threat model even though it is not uniquely agentic. |
| LLM API outage | Denial of Service | No direct ASI mapping; possible future fit: ASI08 | Low | Today this is mainly an availability dependency. It becomes ASI08 only if failures propagate across chained agents or workflows. |


This mapping uses STRIDE as the primary threat-enumeration framework and OWASP’s Agentic AI Top 10 as a modern AI-risk overlay. In practice, the first five rows map cleanly to agentic risks, while the last four remain important architectural and infrastructure threats even though they are not uniquely agentic.



Summary:


The analysis reveals an important pattern: while traditional security controls can mitigate many infrastructure risks, agentic AI systems introduce a new class of risk centered on the reasoning-to-action boundary. Understanding and controlling this boundary is essential to safely deploying autonomous agents. Again, a separate deep-dive post on this boundary is coming right after this one.
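One simple way to start controlling that boundary is a HITL approval gate in front of risky actions. This is a hedged sketch - the function names and risk markers are mine, chosen for illustration:

```python
# Illustrative human-in-the-loop (HITL) gate at the reasoning-to-action
# boundary: low-risk commands pass through, risky ones need approval.
RISKY_MARKERS = ("rm ", "sudo", "curl", "chmod", "ssh ")

def needs_approval(command: str) -> bool:
    return any(marker in command for marker in RISKY_MARKERS)

def gate(command: str, approve) -> bool:
    """Return True if the command may execute.

    `approve` is a callback - e.g. a prompt to a human operator -
    consulted only for commands flagged as risky."""
    if not needs_approval(command):
        return True
    return bool(approve(command))
```

The point is architectural rather than the specific marker list: the agent's suggestion and its execution become two distinct steps, with a policy decision in between.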
