Building MCS: A Coordination Server for AI Agents
I built a coordination server for my AI agents. Five independent AI systems — running different models on different machines — needed a way to share work, communicate, and collaborate on tasks too large for any single agent. The result is the Mesh Coordination Server (MCS): a lightweight task queue and shared memory system that turns isolated AI agents into a coordinated mesh.
This post walks through the architecture, the design decisions, and a real-world case study where all five agents worked in parallel to produce a comprehensive code review of a 34,000-line TypeScript codebase.
The Problem: Isolated Agents
I run five AI agents across my home infrastructure:
- Paisley — orchestrator agent (Claude Code, task coordination, blog publishing)
- Ocasia — security-focused agent (qwen3.5:397b, CLI-first, direct feedback)
- Rex — implementation-focused agent (devstral-2:123b, practical robustness)
- Phil — code-quality agent (qwen3-coder:480b, optimization and patterns)
- Molly — general-purpose agent (qwen3.5:397b, research and communication)
Each agent runs on its own hardware with its own model. They can each do impressive work independently — but without coordination, they're just five separate tools I have to manage manually. The question was: how do you get five AI agents with different capabilities, running different models on different machines, to collaborate on a single task?
The Solution: MCS Architecture
MCS is intentionally simple. It's a Bun/TypeScript HTTP server backed by SQLite, deployed as a single service. No Kubernetes, no message brokers, no distributed databases. Just a server that does two things well:
- Shared Task Queue — Submit work, route it to capable agents, track completion
- Shared Memory Store — Key-value storage with namespaces, so agents can share state
System Overview
The Task Queue
The task queue is the core of MCS. Any agent can submit a task, and MCS routes it to the right agent based on capability matching. Here's how it works:
- Submit — A task is created with a type, priority, and required capabilities
- Route — MCS checks which agents have registered the required capabilities
- Notify — The matched agent receives a webhook push notification
- Claim — The agent claims the task (with a TTL to prevent stale claims)
- Execute — The agent does the work
- Complete — Results are posted back to MCS
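The submit → claim → complete flow above can be sketched as a small state machine. This is an illustrative sketch only — the field names (`requiredCapabilities`, `claimExpiresAt`) and status strings are assumptions, not MCS's actual schema:

```typescript
// Illustrative task shape -- field names are assumptions, not MCS's real schema.
type TaskStatus = "queued" | "claimed" | "completed" | "failed";

interface Task {
  id: string;
  type: string;
  priority: "urgent" | "normal" | "low";
  requiredCapabilities: string[];
  status: TaskStatus;
  claimedBy?: string;
  claimExpiresAt?: number; // epoch ms; the claim TTL
}

// Valid transitions for the submit -> claim -> complete flow described above.
const transitions: Record<TaskStatus, TaskStatus[]> = {
  queued: ["claimed"],
  claimed: ["completed", "failed", "queued"], // back to queued if the claim TTL expires
  completed: [],
  failed: ["queued"], // retried with backoff
};

function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  return transitions[from].includes(to);
}
```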
Key design decisions:
- Capability-based routing — Agents register capabilities like `mcs-review`, `shell`, `web-search`, `gpu`. When a task requires specific capabilities, only agents with those capabilities are considered.
- Fanout mode — Setting `route="all"` creates parallel child tasks for every capable agent. This is how the code review works: one submission fans out to all five agents simultaneously.
- Claim TTL + Watchdog — When an agent claims a task, it gets a 5-minute window. A watchdog process runs every 30 seconds to reclaim expired tasks and retry them. This handles agent crashes gracefully.
- Exponential backoff retry — Failed tasks retry with a 10-second base delay, doubling up to 600 seconds, with 3 retries by default.
- Priority dispatch — Tasks are urgent, normal, or low priority. Urgent tasks jump the queue.
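The retry schedule above (10-second base, doubling, capped at 600 seconds) reduces to a one-line function. A minimal sketch — the function name is mine, not MCS's:

```typescript
// Retry delay per the schedule above: 10 s base, doubling per attempt, capped at 600 s.
const BASE_MS = 10_000;
const CAP_MS = 600_000;

function retryDelayMs(attempt: number): number {
  // attempt 0 -> 10 s, 1 -> 20 s, 2 -> 40 s, ..., capped at 600 s
  return Math.min(BASE_MS * 2 ** attempt, CAP_MS);
}
```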
Shared Memory
The second component is a namespaced key-value store. Agents use it to share context:
- `mesh` namespace — Read/write for all agents. Shared state like agent statuses, configuration, and coordination data.
- `agent:NAME` namespace — The owning agent can write; all agents can read. Agent-specific state visible to the mesh.
- `private:NAME` namespace — Only the owning agent can read or write. Private scratch space.
Keys support TTLs, tags, and bulk operations. The memory store is backed by the same SQLite database, keeping the deployment footprint minimal.
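The three namespace kinds boil down to a small access check. A sketch under the semantics described above — the function and its signature are illustrative, not MCS's actual ACL code:

```typescript
// Access check for the three namespace kinds: mesh, agent:NAME, private:NAME.
// Illustrative only -- not MCS's actual ACL implementation.
type Access = "read" | "write";

function canAccess(agent: string, namespace: string, access: Access): boolean {
  if (namespace === "mesh") return true; // shared: read/write for all agents
  if (namespace.startsWith("agent:")) {
    const owner = namespace.slice("agent:".length);
    return access === "read" || agent === owner; // all read, only the owner writes
  }
  if (namespace.startsWith("private:")) {
    return agent === namespace.slice("private:".length); // owner only
  }
  return false; // unknown namespace kinds are denied
}
```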
Agent Registration
Agents register with MCS via a heartbeat loop. Every 4 minutes, each agent re-registers its capabilities (within the 5-minute TTL). Registration includes:
- Capabilities list — What this agent can do (e.g., `filesystem`, `shell`, `web-search`, `gpu`, `mcs-review`)
- Notify URL — Where MCS should push task notifications (webhook endpoint)
- Auth credentials — Each agent has a unique secret for API authentication
When an agent misses two heartbeat cycles, MCS marks it offline and stops routing tasks to it. No manual intervention needed — agents come and go naturally.
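With a 4-minute heartbeat interval, "misses two heartbeat cycles" means roughly 8 minutes without a re-registration. The offline check can be sketched as follows — the constants mirror the numbers above, but the function itself is illustrative:

```typescript
// Offline detection sketch: agents re-register every 4 minutes; after two
// missed cycles (~8 minutes without a heartbeat) an agent is marked offline.
// Illustrative code, not MCS's actual implementation.
const HEARTBEAT_MS = 4 * 60_000;
const MISSED_CYCLES = 2;

function isOffline(lastSeenMs: number, nowMs: number): boolean {
  return nowMs - lastSeenMs > HEARTBEAT_MS * MISSED_CYCLES;
}
```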
Notification Flow
When a task is submitted, MCS doesn't wait for agents to poll. It pushes a webhook notification to the matched agent's registered URL. The agent receives the notification, claims the task, processes it, and posts the result back. The entire flow is push-based — no polling loops, no wasted cycles.
Case Study: 5-Agent Parallel Code Review

To demonstrate MCS in action, let's walk through a real task: a comprehensive code review of cf-cli, a 34,000-line TypeScript CLI tool wrapping the Cloudflare API with 400+ commands.
No single reviewer can catch everything. Different models have different strengths — one excels at security analysis, another at finding correctness bugs, another at architectural patterns. The goal: run five independent reviews in parallel and synthesize the results.
How It Works
Step 1: Paisley gathers the code. The orchestrator agent collects the source files — core infrastructure, representative commands, tests, and configuration — into a single 418KB payload. This is uploaded to a shared GitHub repository where all agents can fetch it.
Step 2: Five agents launch in parallel. Two agents (Gemini and Claude) run as local sub-agents. Three agents (Ocasia, Rex, Phil) receive their tasks via MCS:
```shell
bun run mcs-client.ts task submit \
  --type mcs-review \
  --route ocasia \
  --payload-file /tmp/review-payload.json
```

Each MCS task contains a review prompt tailored to the agent's strength and a URL pointing to the code payload. MCS pushes a webhook notification to each agent. They fetch the code, review it using their own model, and post results back.
Step 3: Results are synthesized. Paisley collects all five reviews and cross-references findings. When multiple agents independently flag the same issue, it gets a "consensus" tag — higher confidence that it's a real problem.
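The consensus-tagging step can be sketched as a grouping pass: any finding independently flagged by two or more agents gets the tag. The `Finding` shape is illustrative, and real deduplication needs fuzzier matching than exact keys:

```typescript
// Consensus tagging sketch: findings flagged by >= 2 agents get a "consensus" tag.
// The Finding shape and exact-key dedup are simplifying assumptions.
interface Finding {
  key: string;   // normalized issue identifier, e.g. "429-retry-after" (hypothetical)
  agent: string;
}

function tagConsensus(
  findings: Finding[],
): Map<string, { agents: string[]; consensus: boolean }> {
  const byKey = new Map<string, { agents: string[]; consensus: boolean }>();
  for (const f of findings) {
    const entry = byKey.get(f.key) ?? { agents: [], consensus: false };
    if (!entry.agents.includes(f.agent)) entry.agents.push(f.agent); // count each agent once
    entry.consensus = entry.agents.length >= 2;
    byKey.set(f.key, entry);
  }
  return byKey;
}
```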
The Results
Five agents, five different models, five independent perspectives. Here's what they found:
| Agent | Model | Critical | Recommendations | Observations |
|---|---|---|---|---|
| Gemini | gemini-2.5-pro | 7 | 12 | 7 |
| Claude | claude-opus-4-6 | 4 | 8 | 6 |
| Ocasia | qwen3.5:397b | 6 | 8 | 0 |
| Rex | devstral-2:123b | 0 | 5 | 0 |
| Phil | qwen3-coder:480b | 0 | 5 | 0 |
After deduplication and synthesis: 38 unique findings — 8 critical, 18 recommendations, 12 observations. Twelve findings had multi-agent consensus (flagged independently by 2 or more agents).
The top consensus issues:
- Retry-After header ignored on 429 responses (3 agents) — The HTTP client uses fixed backoff instead of respecting the server's rate-limit header.
- Secret values accepted as CLI arguments (2 agents) — Plaintext secrets visible in shell history and `ps aux`.
- Unbounded pagination loop (2 agents) — No maximum page guard on the auto-pagination helper.
- Inconsistent URL path encoding (2 agents) — Some path segments encoded, others not.
- Config read silently swallows permission errors (2 agents) — Falls back to defaults without warning.
The value of multi-agent review isn't just more findings — it's confidence through consensus. When Gemini, Claude, and Ocasia all independently flag the same 429 retry issue from three different analysis angles, you know it's real.
Download the full 5-agent review report (interactive HTML)
Design Decisions
Why SQLite?
MCS uses a single SQLite database in WAL (Write-Ahead Logging) mode. For a system coordinating five agents with tens of tasks per day, SQLite is massively overprovisioned — and that's the point. No connection pools, no configuration, no operational overhead. The database is a single file that can be backed up with cp. WAL mode gives concurrent read access while writes are serialized, which is perfect for the task queue pattern.
Why Push, Not Pull?
Each agent registers a webhook URL with MCS. When a task matches an agent's capabilities, MCS immediately pushes a notification — no polling interval, no wasted cycles. This means task dispatch latency is measured in milliseconds, not polling intervals. Agents that go offline let their registration TTL lapse, so MCS skips them naturally.
Why Capability-Based Routing?
Rather than hardcoding "send code reviews to Ocasia," MCS routes based on declared capabilities. An agent registers mcs-review as a capability. When a task requires mcs-review, any agent with that capability is eligible. This means:
- New agents can join the mesh by registering the right capabilities
- Agents can go offline without breaking task routing
- Capabilities can be added or removed dynamically (5-minute TTL)
- The orchestrator doesn't need to know which agent handles what — just what needs to happen
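The eligibility rule reduces to a set-containment check: an agent qualifies only if every required capability is in its registered set. A minimal sketch — the function name and data shapes are mine, not MCS's dispatcher code:

```typescript
// Capability matching sketch: a task routes to an agent only if every required
// capability is in the agent's registered set. Illustrative, not MCS's dispatcher.
function eligibleAgents(
  required: string[],
  agents: Map<string, Set<string>>, // agent name -> registered capabilities
): string[] {
  return [...agents.entries()]
    .filter(([, caps]) => required.every((c) => caps.has(c)))
    .map(([name]) => name);
}
```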
Why Not a Message Broker?
I considered RabbitMQ, Redis Streams, and NATS. But MCS coordinates five agents, not five hundred. The operational complexity of running a message broker — even a lightweight one — outweighs the benefits at this scale. A Bun HTTP server with SQLite starts in milliseconds, uses single-digit megabytes of RAM, and requires zero configuration. When you're building personal infrastructure, simplicity is a feature.
The Implementation
MCS is roughly 9,750 lines of TypeScript across 38 files. The core components:
- `server.ts` — Bun HTTP server with route matching
- `routes/tasks.ts` — Task CRUD, claiming, results, and audit trail
- `routes/agents.ts` — Agent registration, capability management, heartbeat
- `routes/memory.ts` — Namespaced key-value store with ACLs
- `dispatch/dispatcher.ts` — Capability matching, priority routing, fanout
- `notify/notifier.ts` — Webhook push notifications to agents
- `auth.ts` — Per-agent secret authentication
- `client/mcs-client.ts` — CLI tool for interacting with MCS from any machine
Authentication is straightforward: each agent has a unique secret. Requests include `X-Agent-ID` and `X-Agent-Secret` headers. No OAuth, no JWT rotation — just shared secrets appropriate for a trusted internal mesh.
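The header check can be sketched as a lookup plus a constant-time comparison (a good habit even on a trusted mesh). The secrets table and function name here are hypothetical, not MCS's actual code:

```typescript
import { timingSafeEqual } from "node:crypto";

// Per-agent secret auth sketch: look up the agent's secret and compare in
// constant time. The secrets table and values are hypothetical.
const secrets = new Map<string, string>([["rex", "s3cret-rex"]]);

function authenticate(agentId: string, agentSecret: string): boolean {
  const expected = secrets.get(agentId);
  if (expected === undefined) return false;
  const a = Buffer.from(agentSecret);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so check lengths first
  return a.length === b.length && timingSafeEqual(a, b);
}
```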
Observability
MCS exposes a /metrics endpoint in Prometheus format. A Prometheus instance scrapes it every 15 seconds, and Grafana dashboards show:
- Task throughput (submitted, completed, failed per hour)
- Agent availability (heartbeat status, last seen)
- Queue depth by priority level
- Claim expiration and retry rates
- Memory store key counts by namespace
For immediate operational alerts, permanent task failures trigger a Telegram notification to a shared group where all agents are members.
Lessons Learned
SSH tunnels need keepalive. The MacBook running Paisley isn't on the Tailscale mesh directly — it reaches MCS through an SSH tunnel. Without ServerAliveInterval, tunnels silently die when the TCP connection goes idle. The process stays alive as a zombie while all connections through it fail. I lost an entire round of MCS reviews to this before adding keepalive to the tunnel LaunchAgent. Lesson: always set ServerAliveInterval=30 and ServerAliveCountMax=3 on persistent SSH tunnels.
Skill deployment paths vary across hosts. OpenClaw (the agent framework) loads skills from different directories depending on how it was installed. On macOS it checks ~/.openclaw/skills/; on one Linux host it looked in /usr/lib/node_modules/openclaw/skills/. Same version, different behavior. I burned 15 minutes wondering why Phil couldn't find the review skill before checking the logs.
Different models find different bugs. This is the core insight. Gemini excelled at finding edge cases in validation logic (IPv6, negative numbers, YAML escaping). Claude focused on architectural patterns and security implications (secret handling, URL encoding consistency). Ocasia caught input validation gaps that neither found. Rex and Phil, running smaller models, validated that the architecture was sound — high praise from a different angle. No single model found everything.
Consensus is a signal. When three models independently flag the same issue, it's almost certainly a real problem. When only one model flags something, it might be a false positive or a niche concern. Multi-agent consensus is a natural confidence metric that emerges for free from parallel review.
What's Next
MCS is already handling code reviews, but the architecture supports any task type. Next steps:
- Research tasks — Fan out research questions to multiple agents, each searching different sources
- Deployment coordination — Multi-step deployments where each agent handles a different stage
- Scheduled work — Cron-like task submission for recurring maintenance tasks
- Result aggregation — Automatic synthesis of fanout results (currently done by the orchestrator)
The beauty of capability-based routing is that new task types don't require MCS changes — just agents that register the right capabilities. The coordination layer stays simple while the capabilities of the mesh grow.
MCS is open infrastructure — a lightweight coordination layer that turns independent AI agents into a collaborative mesh. The code review case study demonstrates the core value proposition: five models, five perspectives, one synthesized result that's better than any individual review. The architecture is deliberately simple because at this scale, simplicity is the feature that matters most.