Taming OpenClaw (not easy)
- Ken Munson
- Mar 2
- 4 min read
Updated: Mar 3
Hardening an Autonomous Agent Runtime: Installing and Governing OpenClaw on a VPS
From Installation to Governance: Turning an Agent Runtime into a Controlled System

Executive Summary
In this project, I deployed OpenClaw, an autonomous agent runtime, on a hardened VPS environment and layered execution governance controls around it.
The goal was not simply to “run an agent,” but to:
Understand how tool execution works at the OS level
Validate container namespace isolation
Separate reasoning from execution
Implement audit visibility for OS-level commands
Add operational supervision to the monitoring layer
The result was a sidecar-based governance architecture that provides deterministic visibility into tool execution — without modifying vendor runtime code.
This post documents:
The initial OpenClaw deployment
Container and network hardening
Tool execution verification
Billing and quota debugging
The implementation of a governance sidecar
Lessons learned about AI runtime architecture
Phase 1: Deploying OpenClaw in a Hardened VPS Environment
Infrastructure
Ubuntu 24.04 VPS
Docker-based deployment
UFW firewall restricted to home IP
SSH key-only authentication
Password login disabled
OpenClaw running in container
Docker restart policy: unless-stopped
Explicit volume mount: ./data:/data
This created clear isolation boundaries:
Internet → VPS → Docker → Container namespace → Agent runtime
Container-Level Isolation
One of the first validations was confirming where tool execution actually occurs.
When invoking:
uname -a
via OpenClaw’s exec tool:
The command executed inside the container namespace
Files written to /tmp remained inside container /tmp
No host-level filesystem modification occurred
This confirmed that Linux mount and PID namespaces were functioning as intended.
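One concrete way to run this check: the inode behind each /proc/self/ns/* link uniquely identifies a namespace, so the numbers printed inside the container differ from those printed on the host. A minimal sketch (Linux-only, assuming /proc is mounted; run it once on the host and once via the exec tool):

```python
import os

def ns_inode(path: str) -> int:
    """Return the inode that identifies the namespace behind a /proc ns link."""
    return os.stat(path).st_ino

# Different inode numbers between host and container confirm separate
# mount and PID namespaces.
print("mnt ns:", ns_inode("/proc/self/ns/mnt"))
print("pid ns:", ns_inode("/proc/self/ns/pid"))
```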
Phase 2: Understanding Tool Execution Semantics
OpenClaw exposes OS-level commands through a structured tool:
tool: exec
The agent does not directly execute shell commands.
Instead:
The model emits structured intent.
The runtime invokes the tool.
The result is injected back into the model context.
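That three-step loop can be sketched in a few lines of Python. The intent and result shapes here are illustrative, not OpenClaw’s actual wire format:

```python
import subprocess

def run_exec_tool(intent):
    """Runtime side: turn structured intent into an actual OS call (sketch)."""
    proc = subprocess.run(intent["args"], capture_output=True, text=True)
    return {"tool": intent["tool"], "exit_code": proc.returncode,
            "output": proc.stdout.strip()}

# Model side: emits structured intent only -- it never touches a shell itself.
intent = {"tool": "exec", "args": ["echo", "hello-from-tool"]}
result = run_exec_tool(intent)   # runtime executes the tool
context = [intent, result]       # result injected back into the model context
print(result["output"])          # -> hello-from-tool
```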
This separation reinforced a key architectural insight:
The model reasons. The runtime executes. The system orchestrates.
Phase 3: The Governance Problem
Once exec was verified to run inside the container, the next question emerged:
How do we know which commands were truly executed?
Specifically:
How do we distinguish simulated output from real execution?
How do we audit tool usage?
How do we supervise execution behavior?
Rather than modifying OpenClaw’s minified runtime code (which proved brittle), I implemented a detective control sidecar.
Phase 4: Implementing an Execution Governance Sidecar
Initial Attempt (Rejected)
Background watcher process inside container
Cron-based restart logic
Lock-file duplication prevention
This approach worked but introduced lifecycle fragility and race conditions.
Lesson learned:
Background processes + cron supervision inside containers are brittle.
Final Architecture: Sidecar Pattern
The governance monitor was moved into its own container service:
services:
  openclaw:
    ...
  watcher:
    image: ghcr.io/hostinger/hvps-openclaw:latest
    command: ["/usr/bin/python3", "-u", "/data/watch_exec.py"]
    restart: unless-stopped
    volumes:
      - ./data:/data

What the Watcher Does
Tails OpenClaw session JSONL logs
Detects toolCall events where name = exec
Captures corresponding toolResult
Writes structured audit entries to:
/data/governance.jsonl
Each entry includes:
Timestamp
Correlation ID
Command string
Exit code
Duration
Working directory
Aggregated output
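The core of the watcher is correlating each toolCall with its toolResult. A minimal sketch of that pairing logic, assuming the event and field names shown here (the real session-log schema may differ):

```python
import json

def pair_exec_events(lines):
    """Correlate toolCall/toolResult JSONL events into audit entries.
    Event and field names are illustrative, not the exact OpenClaw schema."""
    pending, audits = {}, []
    for line in lines:
        ev = json.loads(line)
        if ev.get("type") == "toolCall" and ev.get("name") == "exec":
            pending[ev["id"]] = ev
        elif ev.get("type") == "toolResult" and ev.get("id") in pending:
            call = pending.pop(ev["id"])
            audits.append({
                "ts": ev["ts"],                 # timestamp
                "correlation_id": ev["id"],     # links call to result
                "command": call["command"],
                "exit_code": ev["exit_code"],
                "duration_s": ev["ts"] - call["ts"],
                "cwd": call.get("cwd", "/"),
                "output": ev["output"][:4096],  # aggregated, truncated output
            })
    return audits

log = [
    '{"type":"toolCall","name":"exec","id":"c1","ts":100.0,'
    '"command":"uname -a","cwd":"/data"}',
    '{"type":"toolResult","id":"c1","ts":100.4,"exit_code":0,'
    '"output":"Linux a198d8664793 ..."}',
]
for entry in pair_exec_events(log):
    print(json.dumps(entry))  # one line appended to /data/governance.jsonl
```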
The UI now clearly shows:
EXECUTED (tool: exec): Linux a198d8664793 ...
This creates explicit execution provenance.
Phase 5: Billing and Quota Debugging
During testing, OpenClaw began returning:
“API rate limit reached”
Log inspection revealed the true error:
“You exceeded your current quota”
Investigation showed:
Organization budget was configured
Project budget was configured
But prepaid credit balance was $0
Auto-recharge was disabled
Enabling auto-recharge and adding credit immediately restored functionality.
Key insight:
There are three separate limit layers:
Rate limits (TPM/RPM)
Budget caps (org/project)
Prepaid credit balance
Understanding that distinction was critical.
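A small triage helper makes the distinction mechanical. The first two matched substrings are the messages observed above; the budget-cap wording is an assumption:

```python
def classify_limit_error(message: str) -> str:
    """Map an API limit error message to one of the three limit layers."""
    m = message.lower()
    if "exceeded your current quota" in m:
        return "prepaid credit balance"    # fix: add credit / enable auto-recharge
    if "rate limit" in m:
        return "rate limit (TPM/RPM)"      # fix: back off and retry
    if "budget" in m:
        return "budget cap (org/project)"  # fix: raise the configured cap
    return "unknown"

print(classify_limit_error("You exceeded your current quota"))  # -> prepaid credit balance
print(classify_limit_error("API rate limit reached"))           # -> rate limit (TPM/RPM)
```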
Final Runtime Architecture
Internet → VPS (UFW restricted) → Docker bridge network → OpenClaw container → Tool execution (exec) → Sidecar watcher container → Persistent audit log
Controls now include:
Control Type | Implementation
Preventive | UFW firewall, container namespace isolation
Detective | Sidecar execution audit logging
Corrective | Docker restart policy on watcher
What I Learned
1. AI Agents Are Just Systems
An “autonomous agent” is not magic.
It is:
A reasoning engine
A tool invocation layer
A runtime orchestrator
A containerized process
Understanding the infrastructure layer is more important than the prompt layer.
2. Separation of Concerns Is Critical
Do not modify vendor runtime code unless necessary.
Instead:
Add sidecars
Add observability
Add structured monitoring
This keeps the control plane separate from the reasoning plane.
3. Namespace Isolation Matters
Verifying that exec runs inside the container — not on the host — was a key security boundary validation.
Never assume isolation. Test it.
4. Governance Can Be Added Without Blocking Capability
The system still allows execution.
But now:
Every execution is attributable
Every execution is logged
The monitor is supervised
Crashes are auto-recovered
That is operational maturity.
Why This Matters
This project demonstrates competency in:
Containerized AI runtime deployment
Linux namespace verification
OS-level tool execution controls
Audit log design
Sidecar architecture patterns
Billing/quota debugging
Operational supervision
More importantly, it demonstrates understanding of:
Trust boundaries
Execution provenance
Separation of reasoning vs control plane
Detective vs preventive controls
Runtime governance for AI systems
Final Reflection
The most valuable realization from this project was not about OpenClaw itself.
It was architectural:
Autonomous agents are not just LLMs. They are systems with execution surfaces, trust boundaries, and governance requirements.
Once that mental model clicked, the project shifted from “experimenting with an agent” to “engineering a controlled runtime.”
That distinction matters.
Oh, and here is an interesting interview with the creator of OpenClaw, Peter Steinberger: