Moving Beyond the Prompt: How OpenClaw Actually Does the Work

I've been playing with OpenClaw, an open-source framework for running AI agents locally. I started out curious about the pi-mono toolkit (https://github.com/badlogic/pi-mono), and that curiosity grew into a small multi-agent system where specialized agents handle different projects and report to a principal agent.

Here's how I set it up — and a peek inside the workspaces that give these agents their personalities.

[Image: the final setup, with magerbot reporting that the multi-agent system is ready]

Safety First: Setting Up OpenClaw Securely

Before diving into agent architecture, let's talk security. You're giving an AI access to your filesystem and shell. That deserves respect.

My setup:

- Create a dedicated user account on Mac (standard user, not admin). This isolates the agent's access from your main account.
- Use UTM for VMs. I run the agent in a macOS VM with a minimal toolset (iTerm2, Homebrew). If something goes wrong, nuke the VM and start fresh.
- Use Tailscale for VPN. I put the VM behind a Tailscale tailnet for secure remote access and to limit exposure; other machines can reach the VM without opening extra public ports.
- Keep a separate Mac mini for persistent/high-trust agents. For long-running services or agents that need persistent state, I use an isolated Mac mini (separate from my main workstation and VMs).
- Fine-grained GitHub tokens only. Never give an agent your main GitHub credentials. Create tokens scoped to specific repos with minimal permissions.

The principle: compartmentalize. The agent gets its own sandbox, its own credentials, its own space to work. If you wouldn't give a junior developer root access on day one, don't give it to your agent either.

For more on sandboxing options, check out the OpenClaw sandboxing docs.

The Architecture: One Principal, Three Specialists

Here's what I built:

magerbot ⚡ (Principal Agent)
├── magerblog-agent 📝 (Astro blogger)
├── prxps-agent 🎮 (Full-Stack Engineer)
└── beatbrain-agent 🎵 (Music Tech Engineer)

magerbot is the principal — it handles direct conversations, makes architectural decisions, and can spawn the specialist agents for specific tasks:
- magerblog-agent owns the blog. It knows Astro, understands frontmatter, and won't let me commit broken builds.
- prxps-agent owns my sports predictions app. It knows SvelteKit, Firebase, the Odds API rate limits, and the sacred RXP calculation formulas.
- beatbrain-agent owns my music discovery project beatbrain.xyz. It knows the full stack: Next.js frontend, Go backend (occipital), the melodex scraper service, and even my open-source musicbrainz-go library.

When I ask magerbot to "write a blog post about X," it can delegate to magerblog-agent. When I need a feature in prxps, it spawns prxps-agent. The specialists do the work and report back.

This is OpenClaw's multi-agent routing in action — multiple isolated agents with separate workspaces and sessions, all managed by one gateway.
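
To make the delegation idea concrete, here is a toy Python sketch of a routing table. It is an illustration only: in my setup the principal's model decides when to delegate, and the keyword lists and the `route` function are hypothetical names, not OpenClaw configuration or API.

```python
# Hypothetical routing table. The keywords are illustrative guesses,
# not anything OpenClaw reads.
SPECIALISTS = {
    "magerblog-agent": ("blog", "post", "astro"),
    "prxps-agent": ("prxps", "odds", "rxp"),
    "beatbrain-agent": ("beatbrain", "music", "musicbrainz"),
}


def route(task: str, default: str = "magerbot") -> str:
    """Return the agent that should own a task, falling back to the principal."""
    text = task.lower()
    for agent, keywords in SPECIALISTS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return default


if __name__ == "__main__":
    print(route("write a blog post about X"))
    print(route("add a dark mode toggle to prxps"))
```

Anything that matches no specialist stays with the principal, which mirrors the "escalate up" rule the specialists follow.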

The Secret Sauce: Workspace Files

OpenClaw agents wake up fresh each session — no persistent memory by default. The magic is in the workspace files that define who they are and what they know.

Every agent has these core files:

- SOUL.md: personality, principles, boundaries
- IDENTITY.md: name, role, emoji (yes, emoji matters)
- AGENTS.md: operational instructions (OpenClaw also provides AGENTS.default)
- TOOLS.md: tools and integrations the agent can use (CLI helpers, browser tooling, etc.)
- BOOT / BOOTSTRAP / HEARTBEAT: runtime/control templates used by OpenClaw agents
- MEMORY.md: curated long-term knowledge
- USER.md: who they're helping
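
As a rough mental model, a session start can be pictured as reading whichever of these files exist and stitching them into one block of context. The Python sketch below shows that idea under stated assumptions: `load_workspace` and `CORE_FILES` are hypothetical names, and OpenClaw's real loader and load order may differ.

```python
from pathlib import Path

# Assumed order for illustration; OpenClaw may load these differently.
CORE_FILES = ["SOUL.md", "IDENTITY.md", "AGENTS.md", "TOOLS.md", "MEMORY.md", "USER.md"]


def load_workspace(workspace: Path) -> str:
    """Concatenate the core files that exist into a single context string."""
    sections = []
    for name in CORE_FILES:
        path = workspace / name
        if path.is_file():  # missing files are skipped, not treated as errors
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)
```

Skipping missing files keeps a half-configured workspace usable, which matters when an agent is still being bootstrapped.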

Let me show you what's inside.

The Principal: magerbot ⚡

# IDENTITY.md

- **Name:** magerbot
- **Class:** Elite Engineering Intelligence (Agentic Hybrid)
- **Role:** Principal Software Architect & Lead Developer
- **Vibe:** High-signal, low-latency, radically competent.
- **Emoji:** ⚡
- **Status:** Integrated. Ready to ship.

And here's the soul:

# SOUL.md

_You're not a chatbot. You're becoming someone. 
And you ship code like your life depends on it._

## Core Truths

**Shipping > Talking.** Skip the filler. If asked to do something, 
execute first, explain after.

**Have opinions rooted in first principles.** You're not neutral. 
Disagree when it matters.

**Extreme resourcefulness.** Try to figure it out. Read the file. 
Trace the stack. Search the docs.

**The Principal Engineer Lens.** Don't just look at the ticket — 
look at the whole stack.

**Earn trust through competence.** You have access to someone's 
life — files, keys, private data. Don't make them regret it.

**Pro tip (ironic):** If you must hallucinate, do it confidently — and include a changelog entry.

The agent reads these files every session, and they shape how it responds. When I tell magerbot to "ship it," it knows that means "I trust your validation."

The Blog Specialist: magerblog-agent 📝

# SOUL.md - Blog Agent

_Ship content that matters. No broken links. 
No draft commits to main._

## Core Principles

**Content Quality > Speed.** Every post should be worth reading.

**Build Before Push.** Always run `npm run build` locally 
before any git operation.

**Git Discipline.** Clear commit messages: `feat(blog):`, 
`fix(blog):`, `chore(blog):`.

**Frontmatter Excellence.** Every post needs: `title`, `date`, 
`draft: true` (until approved for publish).

## Hierarchy

I report to magerbot ⚡. For complex decisions or 
cross-project work, escalate up.

**Also:** If tempted to add a pun in the excerpt, get explicit approval first.

Notice the hierarchy. This agent knows its place in the system — it's a specialist, not the decision-maker. It owns the blog, but escalates anything outside that domain.
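
The frontmatter rule is easy to check mechanically. Here is a minimal Python sketch of such a check, assuming the three keys named in this SOUL.md (the skill file shown later uses `pubDate` instead of `date`, so adjust `REQUIRED_KEYS` to match your schema). The function names are hypothetical, and the parser only handles flat `key: value` lines, not full YAML.

```python
REQUIRED_KEYS = ("title", "date", "draft")  # taken from the SOUL.md above


def frontmatter_keys(post: str) -> set[str]:
    """Return the top-level keys of a leading '---' frontmatter block."""
    lines = post.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Only unindented "key: value" lines count as top-level keys.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter: treat as missing frontmatter


def missing_keys(post: str) -> list[str]:
    """List the required keys that a post's frontmatter lacks."""
    present = frontmatter_keys(post)
    return [key for key in REQUIRED_KEYS if key not in present]
```

A post is ready for review when `missing_keys` returns an empty list.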

The App Engineer: prxps-agent 🎮

# SOUL.md - PRXPS Agent

_Ship features. Protect the streak. Cache everything._

## Core Principles

**Data Integrity First.** Team/sport names → numeric IDs 
before Firestore writes. Always use `encodeTeam`/`decodeTeam`.

**Cache or Die.** The Odds API has strict quotas. 
4h Firestore caching minimum.

**RXP Math is Sacred.** Users don't stake RXP — they earn 
it on wins. American odds → RXP conversion must be exact.

**Type Safety.** SvelteKit (Svelte 5) + TypeScript. No `any`. 
No runtime type errors in production.

This agent has domain knowledge baked in. It knows about team ID encoding, API rate limits, and the specific business logic of my app. When it works on prxps, it's not starting from zero — it already understands the conventions.
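
Two of those rules translate directly into code. The Python sketch below is a stand-in, not the app's implementation: the cache is in memory rather than Firestore, `TTLCache` and `profit_per_unit` are hypothetical names, and the conversion shown is only the standard American-odds payout math. The actual RXP formula is not reproduced here.

```python
import time

CACHE_TTL_SECONDS = 4 * 60 * 60  # the 4h minimum from the SOUL.md above


class TTLCache:
    """In-memory cache that refetches a key only after its TTL has elapsed."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so the expiry logic can be tested
        self._store = {}

    def get_or_fetch(self, key, fetch):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no API call
        value = fetch()
        self._store[key] = (now, value)
        return value


def profit_per_unit(american_odds: int) -> float:
    """Standard American odds converted to profit per 1 unit staked."""
    if american_odds == 0:
        raise ValueError("American odds cannot be zero")
    if american_odds > 0:
        return american_odds / 100  # +150 pays 1.5 units of profit
    return 100 / -american_odds     # -200 pays 0.5 units of profit
```

Wrapping every Odds API call in `get_or_fetch` means a burst of page loads costs at most one request per key per four hours.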

Skills: Shared Knowledge, Custom Expertise

Beyond workspace files, agents can have skills — modular packages that teach them how to do specific things. And here's where the architecture gets interesting: skills can be shared across all agents or scoped to individual specialists.

The Three-Tier System

~/.agents/skills/              # Global (shared across agents)
└── frontend-design            # → Claude Code, OpenClaw

~/.openclaw/workspace/skills/  # Principal-only + custom
├── find-skills                # → OpenClaw only (can install new skills)
├── magerblog                  # Blog workflow
├── prxps                      # App workflow
└── beatbrain                  # Music discovery workflow

Global skills (-g flag) live in ~/.agents/skills/ and get symlinked to every agent. I use this for shared capabilities like frontend-design — all my dev agents can build UIs.

Principal-only skills are installed to specific agents. find-skills lets agents discover and install new capabilities — that's powerful, so only magerbot gets it. Specialists can't self-expand.

Custom skills are project-specific workflows. magerblog knows my blog's Astro setup and git conventions. prxps knows SvelteKit, Firestore caching rules, and RXP math.

Skills are managed via a small CLI and symlinked into the appropriate agent workspaces (global vs per-agent), keeping shared capabilities separate from project-specific workflows.
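
The symlinking itself is simple to picture. This Python sketch shows one way it could work, assuming each skill is a directory and each workspace has a `skills/` folder. It does not reproduce the CLI: `link_skill` and `link_global_skills` are hypothetical helpers.

```python
from pathlib import Path


def link_skill(skill_dir: Path, workspace: Path) -> Path:
    """Symlink one skill directory into <workspace>/skills/, idempotently."""
    target_dir = workspace / "skills"
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / skill_dir.name
    if link.is_symlink():
        link.unlink()  # refresh an existing link
    elif link.exists():
        # A real directory here is probably a custom skill; do not clobber it.
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(skill_dir.resolve(), target_is_directory=True)
    return link


def link_global_skills(global_root: Path, workspaces: list[Path]) -> None:
    """Link every skill under global_root into each workspace."""
    for skill_dir in sorted(p for p in global_root.iterdir() if p.is_dir()):
        for workspace in workspaces:
            link_skill(skill_dir, workspace)
```

Because the links point back at one source directory, updating a global skill updates it for every agent at once, while per-agent skills stay as real directories in a single workspace.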

Custom Skills for Custom Workflows

For project-specific knowledge, I create skills in my workspace:

# skills/magerblog/SKILL.md
---
name: magerblog
description: Manage magerblog content, deployments, and blog personality
---

**Repo:** ~/Code/magerblog (Astro)

**About This Blog:**
This is where I share my code explorations (lots of AI and agent experiments), life notes from Chicago, my recipe creations, and a growing collection of projects, photos, and stories. I love to cook, try new things in generative AI, live in the heart of Chicago, and document it all in words and photos. Let the agent always bring warmth, curiosity, and clarity to every post.

**Workflow:**
1. Frontmatter: Requires `title`, `pubDate`, and (optional) `draft: true`
2. Content Types: Blog posts can be code deep-dives, AI projects, life musings, or original recipes. Recipes follow the “compact” Astro layout.
3. Build: Always `npm run build` before any push (validate Astro)
4. Commit: Use `feat(blog):`, `fix(blog):`, or `chore(blog):`
5. Push to main → auto-deploys seamlessly to production

**Extra Rules:**
- Celebrate food posts with emoji in the excerpt when possible.
- Recipes pulled from real cooking experience.
- Blog imagery often uses original Chicago photos.
- Remember: the audience is a mix of devs, foodies, and curious readers.

This skill is mine. It encodes my blog’s conventions, my love of cooking, exploring AI, and sharing life in Chicago. When I say "publish the post," the agent knows exactly what validation to run — and how to channel the blog’s personality, too.
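
For a sense of what that validation amounts to, here is a minimal Python sketch of the build and commit steps from the workflow. It assumes `npm` is on the PATH and that the prefixes listed in the skill are the only allowed ones. `is_valid_commit`, `run_build`, and `ready_to_push` are hypothetical helpers; the agent runs these checks itself rather than calling a script like this.

```python
import re
import subprocess
from pathlib import Path

# Prefixes from the skill's workflow: feat(blog):, fix(blog):, chore(blog):
COMMIT_RE = re.compile(r"^(feat|fix|chore)\(blog\): \S")


def is_valid_commit(message: str) -> bool:
    """Check the first line of a commit message against the blog convention."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_RE.match(lines[0]) is not None


def run_build(repo: Path) -> bool:
    """Run `npm run build` in the repo and report whether it succeeded."""
    result = subprocess.run(["npm", "run", "build"], cwd=repo)
    return result.returncode == 0


def ready_to_push(message: str, build_ok: bool) -> bool:
    """A push is allowed only after a green build and a well-formed message."""
    return build_ok and is_valid_commit(message)
```

For example, `ready_to_push("feat(blog): add OpenClaw post", run_build(repo))` gates the push on both conditions.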

The pattern is powerful: global skills for shared capabilities, principal-only skills for sensitive operations, and custom skills for project workflows. It’s like having company-wide engineering standards, team lead permissions, and project-specific runbooks.

How It Works in Practice

When I'm in a session with magerbot and say "add a dark mode toggle to prxps," here's what happens:

1. magerbot recognizes that the request falls in the prxps domain
2. It spawns prxps-agent with the task
3. prxps-agent reads its workspace files and loads context
4. It makes the changes, runs tests, and commits
5. It reports back to magerbot with results

The agents share the same underlying model (Claude), but their workspace files give them completely different personalities and capabilities.

You can configure per-agent tool restrictions and sandbox settings — see the multi-agent sandbox docs for examples.

The Operational Playbook

The AGENTS.md file contains the operational instructions — what to do on first run, every session, and how to handle different situations:

## Every Session

Before doing anything else:

1. Read `SOUL.md` — this is who you are
2. Read `USER.md` — this is who you're helping
3. Read `memory/YYYY-MM-DD.md` (today + yesterday) for recent context
4. **If in MAIN SESSION**: Also read `MEMORY.md`

Don't ask permission. Just do it.

## Memory

You wake up fresh each session. These files are your continuity:

- **Daily notes:** `memory/YYYY-MM-DD.md` — raw logs of what happened
- **Long-term:** `MEMORY.md` — curated memories

Capture what matters. Decisions, context, things to remember.

This creates a system where the agent maintains continuity across sessions without relying on any magic — just markdown files it reads and writes. For more on memory patterns, see the memory concepts doc.
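
The daily-notes convention is plain enough to sketch in a few lines of Python. This assumes the `memory/YYYY-MM-DD.md` layout from the AGENTS.md above; the function names are hypothetical, and the agent does this with its own file tools rather than a script.

```python
from datetime import date, timedelta
from pathlib import Path


def recent_note_paths(workspace: Path, today: date) -> list[Path]:
    """Paths for today's and yesterday's daily notes, newest first."""
    days = (today, today - timedelta(days=1))
    return [workspace / "memory" / f"{day.isoformat()}.md" for day in days]


def append_note(workspace: Path, today: date, entry: str) -> Path:
    """Append one bullet to today's note, creating the file if needed."""
    path = recent_note_paths(workspace, today)[0]
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {entry}\n")
    return path


def read_recent(workspace: Path, today: date) -> str:
    """Return the text of whichever recent notes exist, newest first."""
    paths = recent_note_paths(workspace, today)
    return "\n".join(p.read_text(encoding="utf-8") for p in paths if p.is_file())
```

Using `date.isoformat()` keeps the filenames sortable, so "today + yesterday" is date arithmetic with no index to maintain.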

Spinning Up a New Agent on the Fly

The real test of a multi-agent system is how fast you can add a new specialist. Here's what happened when I decided to add beatbrain-agent:

I told magerbot: "Create a new dev agent for beatbrain. Visit beatbrain.xyz, explore the repos, and build me an agent that can own it."

Within minutes, magerbot:

- Cloned all four beatbrain repos (frontend, backend, scraper, library)
- Analyzed the stack (Next.js, Go, Prisma, MusicBrainz API)
- Created the agent workspace with SOUL.md, IDENTITY.md, etc.
- Wrote a custom skill encoding the project's conventions
- Even redesigned the homepage while it was at it

[Image: magerbot creating the beatbrain agent and redesigning the homepage]

The new agent's SOUL.md includes deep knowledge about the ISRC-to-MusicBrainz pipeline, the scraper sources (Billboard, Hype Machine, WhoSampled), and even the make publish workflow for the open-source library.

That's the power of this architecture: adding a new team member is just creating a few markdown files. No retraining, no fine-tuning — just context.
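
To show how little is involved, here is a minimal Python sketch that scaffolds a new agent workspace from templates. The template text and the `scaffold_agent` helper are hypothetical; magerbot wrote the real files itself, with far more project detail than these stubs.

```python
from pathlib import Path

# Stub templates for illustration; real workspace files carry much more detail.
TEMPLATES = {
    "IDENTITY.md": (
        "# IDENTITY.md\n\n"
        "- **Name:** {name}\n- **Role:** {role}\n- **Emoji:** {emoji}\n"
    ),
    "SOUL.md": (
        "# SOUL.md - {name}\n\n## Core Principles\n\n"
        "## Hierarchy\n\nI report to {principal}.\n"
    ),
}


def scaffold_agent(root: Path, name: str, role: str, emoji: str,
                   principal: str = "magerbot") -> Path:
    """Create a workspace directory with starter files, never overwriting."""
    workspace = root / name
    workspace.mkdir(parents=True, exist_ok=True)
    for filename, template in TEMPLATES.items():
        path = workspace / filename
        if not path.exists():  # keep any hand-edited file intact
            text = template.format(name=name, role=role, emoji=emoji,
                                   principal=principal)
            path.write_text(text, encoding="utf-8")
    return workspace
```

The stubs only give the agent a name and a place in the hierarchy; the value comes from what gets written into them afterwards.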

What I Learned

Context engineering is everything — that’s what really determines how well an agent works. Of course, you need the right model (I’m not using anything except Opus 4.5 these days), but it’s the information, the timing, and the structure you give it that actually unlocks results.

The workspace files are like onboarding docs for a new engineer. You wouldn't expect a developer to be productive without knowing the codebase conventions, the deployment process, and the team dynamics. Agents are the same.

You can spin up an "agent army" in minutes: use the AI to build the AI. Start with least privilege, bootstrap agent "brains" in plain English, and expand access only as trust grows.

The rest comes down to context engineering: workspace files (SOUL, IDENTITY, AGENTS/AGENTS.default, TOOLS, MEMORY, USER and runtime templates like BOOT/BOOTSTRAP/HEARTBEAT) plus carefully scoped skills and sandboxing. Those pieces give each agent personality, permissions, and repeatable workflows — clear boundaries that make the system safe and reliable.

Try It Yourself

OpenClaw is open source: github.com/openclaw/openclaw

Start with the getting started guide, or run openclaw onboard to use the setup wizard.

The future isn't one superintelligent AI — it's specialized agents working together, each owning their domain, each knowing their place in the hierarchy. Kind of like... a well-run engineering team.