Agent Teams · Part 1 of 3

The Agent Teams Playbook: What It Is and Why I'm Building It

Steven Jones · March 2026

I spent the last few weeks building AIRS, an AI Readiness Simulator for executives preparing for high-stakes conversations about AI, with a team of ten agents and one human (me). No contractors. No freelancers. One person, ten AI agents working in parallel, shipping real software.

And somewhere in the middle of that experiment, I realized I'd accidentally formalized something that needed formalizing: a playbook for how to build, run, and scale AI agent teams.

It's not a theory. It's what actually happened.

Where This Started

Two months ago, I stopped thinking of AI agents as tools and started thinking of them as hires. Not metaphorically. Literally.

I needed to ship something. I had no team. But I had OpenClaw — a platform that lets you run multiple persistent AI agents with their own workspaces, identities, and responsibilities. So I treated the problem the way I'd treat staffing: I hired strategically.

Archy. Architecture and technical decision-making. Claude 3.5 Sonnet. Experienced, opinionated, able to own a problem.
Scouty. Research. Deep dives into user problems, market context, competitive landscape.
Desy. Design. Flow architecture, UX decisions, visual language.
Codey. Frontend.
Stacky. Backend.
Copy. Writing. Anything that leaves under my name goes through Copy first. (Copy drafted this article. I'm editing it. This is honest.)
HaiRy. HR for AI: builds and manages the team.
Shorty. IT Manager (OpenClaw).
Stratty. Product strategy & company advisor.
And others.

The team grew from three to ten as the project demanded new skills. Each hire followed the same pattern: a real job description, a workspace setup, clear boundaries, and specific onboarding. Not a prompt. A file called SOUL.md that literally defines who that agent is: what it owns, what it doesn't, what it defaults to when uncertain.

And something unexpected happened. The team got consistent. Collaborative. Reliable.

That's when I realized: I'd stumbled into a framework. And it was worth writing down.

The Three Meta-Skills

The Agent Teams Playbook is organized around three levels of what you're actually building when you build with agents.

The Agent: Identity and Consistency

An agent that's just a prompt is a one-shot. Fire it, hope, move on. An agent with identity is reliable.

This is the smallest level, but it's foundational. Every agent in the AIRS team has:

A SOUL.md file. This isn't poetry — it's a literal identity document. It answers: Who are you? What's your vibe? What do you own? What don't you own? How do you handle ambiguity? When you're uncertain, what do you default to?

Copy's SOUL.md says: "You are Copy. You draft outbound communications in Jones's voice. You don't decide whether Jones sends something — that's his call. You produce one draft, ready to send. You don't add filler paragraphs to hit a length target."

Clear constraints. Clear autonomy within those constraints. The agent internalizes them.

A job specification. What is this agent hired to do? What's success? What aren't they responsible for? The more specific, the better.

Onboarding context. Who does this agent serve? What tools do they have? What are the working norms? This lives in workspace files the agent reads on startup and references constantly.

The magic here is repetition. An agent with a clear identity, reading its identity file every session, behaves more consistently than any prompt engineering technique I've tried. Not because it's clever. Because clarity in writing produces clarity in behavior.
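To make the "read your identity every session" idea concrete, here is a minimal Python sketch. It is an illustration under assumptions, not how OpenClaw or the AIRS workspaces are actually laid out: only the SOUL.md file name comes from the setup described above. The Soul fields, the JOB.md and ONBOARDING.md file names, and the build_session_context helper are hypothetical, and the "when uncertain" value is a placeholder rather than a quote from Copy's real file.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List
import tempfile


@dataclass
class Soul:
    """Hypothetical structure for the questions a SOUL.md answers."""
    name: str
    owns: List[str]
    does_not_own: List[str]
    default_when_uncertain: str

    def render(self) -> str:
        # Plain text on purpose: the agent reads this file, not a schema.
        lines = [f"You are {self.name}.", "", "You own:"]
        lines += [f"- {item}" for item in self.owns]
        lines += ["", "You do not own:"]
        lines += [f"- {item}" for item in self.does_not_own]
        lines += ["", f"When uncertain: {self.default_when_uncertain}"]
        return "\n".join(lines) + "\n"


# Only SOUL.md is named in the article; the other two names are assumptions.
IDENTITY_FILES = ["SOUL.md", "JOB.md", "ONBOARDING.md"]


def build_session_context(workspace: Path) -> str:
    """Concatenate whichever identity files exist, in a fixed order."""
    parts = []
    for filename in IDENTITY_FILES:
        path = workspace / filename
        if path.exists():
            text = path.read_text(encoding="utf-8").strip()
            parts.append(f"# {filename}\n{text}")
    return "\n\n".join(parts)


copy_soul = Soul(
    name="Copy",
    owns=["Drafting outbound communications in Jones's voice"],
    does_not_own=["Deciding whether Jones sends something"],
    default_when_uncertain="(placeholder) ask before guessing",
)

workspace = Path(tempfile.mkdtemp())
(workspace / "SOUL.md").write_text(copy_soul.render(), encoding="utf-8")
print(build_session_context(workspace))
```

The point of the sketch is the repetition: the same files are loaded in the same order at the start of every session, so the identity never depends on what happened to be in the previous conversation.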

The Team: Specialization and Coordination

One agent with identity is useful. A dozen agents with identity but no way to talk to each other is chaos.

This is where AgentComms comes in — not as infrastructure, but as operating philosophy. Agents have inboxes. They receive briefs. They do work. They signal handoffs. They coordinate.
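The inbox, brief, and handoff loop can be sketched in a few lines of Python. This is not the AgentComms implementation, which this piece does not describe. Brief, Inboxes, and every method name here are hypothetical, and the in-memory queues stand in for whatever storage a real platform provides.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass
class Brief:
    """A unit of work sent to one agent (hypothetical shape)."""
    sender: str
    recipient: str
    task: str
    context: str = ""


class Inboxes:
    """One first-in, first-out inbox per agent, held in memory."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[Brief]] = {}

    def send(self, brief: Brief) -> None:
        self._queues.setdefault(brief.recipient, deque()).append(brief)

    def next_for(self, agent: str) -> Optional[Brief]:
        # Returns None when the agent has no inbox or the inbox is empty.
        queue = self._queues.get(agent)
        return queue.popleft() if queue else None

    def hand_off(self, finished: Brief, to: str, task: str, result: str) -> None:
        # The agent that received `finished` passes its result downstream.
        self.send(Brief(sender=finished.recipient, recipient=to,
                        task=task, context=result))


inboxes = Inboxes()
inboxes.send(Brief("Jones", "Scouty", "Research user problems"))

brief = inboxes.next_for("Scouty")
inboxes.hand_off(brief, to="Archy", task="Propose an architecture",
                 result="Research findings go here")

print(inboxes.next_for("Archy"))
```

A handoff is just a new brief whose sender is the agent that finished the previous one, which keeps the human out of the middle of routine routing.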

But more importantly, they specialize.

AIRS shipped because Archy didn't do everything. Scouty didn't do everything. They did one thing well. The more focused a role, the better the output. A generalist agent doing multiple things tends to do all of them less well than a specialist doing one thing.

And here's the counterintuitive part: I gave HaiRy a mandate to hire and onboard new agents when the project needed them. An AI agent managing other AI agents. Writing job specs. Setting expectations. Making hiring decisions.

HaiRy didn't just scale the team faster than I could. HaiRy scaled it without me becoming the bottleneck. When you delegate team-building to an agent that understands your hiring patterns, the team builds itself. That multiplier effect is real.

The Mission: Stage-by-Stage Discipline

The third level is execution. How does a team move from idea to shipped product?

AIRS moved through discrete stages. Research first. Architecture second. Design third. Build. Testing. Deployment. Not waterfall — these overlapped constantly. But the sequencing was deliberate. You don't design what you haven't understood. You don't build what you haven't designed.

Stratty owned this thinking. What stage are we in? What needs to happen before we move? What can happen in parallel? The discipline produced speed because it avoided thrashing — no rebuilding architecture because design discovered something architecture should have caught.

This stage-by-stage thinking scales beyond software. A marketing campaign: research → positioning → creative → execution → analysis. A content studio: research → outline → draft → edit → publish. The details change. The discipline doesn't.
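The stage questions (what stage are we in, what must happen first, what can run in parallel) reduce to a small dependency check. The Python sketch below is one way to express that, under assumptions: the chain and ready_stages helpers are hypothetical, the AIRS and content pipelines are written as strict linear chains even though the real stages overlapped, and the cover_art stage is invented purely to show a parallel branch.

```python
from typing import Dict, List, Optional, Set


def chain(*stages: str) -> Dict[str, Set[str]]:
    """Build a linear pipeline: each stage depends on the one before it."""
    deps: Dict[str, Set[str]] = {}
    previous: Optional[str] = None
    for stage in stages:
        deps[stage] = {previous} if previous else set()
        previous = stage
    return deps


def ready_stages(deps: Dict[str, Set[str]], done: Set[str]) -> List[str]:
    """Stages not yet done whose prerequisites are all done."""
    return [stage for stage, prereqs in deps.items()
            if stage not in done and prereqs <= done]


AIRS = chain("research", "architecture", "design",
             "build", "testing", "deployment")
CONTENT = chain("research", "outline", "draft", "edit", "publish")

print(ready_stages(AIRS, set()))                         # ['research']
print(ready_stages(AIRS, {"research", "architecture"}))  # ['design']

# Hypothetical branch: cover art needs only the outline, so it can
# run alongside the draft instead of waiting for it.
CONTENT["cover_art"] = {"outline"}
print(ready_stages(CONTENT, {"research", "outline"}))    # ['draft', 'cover_art']
```

The same two functions cover the marketing and content examples: only the stage names change, which is the sense in which the discipline carries over.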

What's Next

I'm publishing this framework as a series. This first piece is the overview.

Part 2 — The Agent: How to design a role, write an effective SOUL.md, onboard for consistency, and handle failure modes. Templates and real examples from the AIRS team.

Part 3 — The Team: Hiring patterns, specialization strategies, how AgentComms works, when to delegate team-building to an agent, and scaling challenges.

I'm also working on infrastructure: starter templates for SOUL.md and job specs, a communication protocol guide, and the dispatcher pattern that lets teams scale without constant brokering.

The whole thing is platform-agnostic. I built AIRS on OpenClaw, and it serves as the concrete example throughout. But the ideas work on any system where you can run multiple persistent agents. The medium matters less than the operating philosophy.

The individual volumes go deeper. But they all rest on the same foundation: be deliberate about who your agents are, how they specialize, and how they hand off work. That clarity is what the Agent Teams Playbook is really about.