Who this is for

You have hit the context wall. A single agent session starts coherent, drifts after 30 tool calls, loses the original goal by call 60, and rereads the same file four times by call 100. You want to ship features that take hours, not sessions that take hours.

You also want to stop watching the model think. Parallel workers mean one finishes while another starts; you review, not wait.

The pattern, in one sentence

The orchestrator decides what to do, workers do it, and results come back as summaries, not transcripts.

How it works

The orchestrator

One long-lived session that holds the high-level goal. It:

Reads the repo layout once
Writes a plan: N tasks, clear scope per task, dependency order
Spawns workers via delegate_task (or shell-launched subagents)
Reviews returned summaries
Commits, moves on

The orchestrator never edits a single line of code itself. It routes, reviews, and decides.

The workers

Short-lived sessions with narrow briefs. A worker:

Receives one paragraph of context + one paragraph of goal
Has full tool access inside a sandboxed path (a git worktree is ideal)
Runs to completion — usually 5–15 minutes
Returns a compact structured summary: files touched, commits made, tests passing, anything surprising

Workers do not know the larger plan. They do not need to. Their loss of context is a feature — they cannot drift because they cannot see the horizon.
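The return contract described above can be pinned down as a small data structure. The following is a minimal sketch, not the return type of Hermes or any other orchestrator: the `WorkerSummary` class, its field names, and the `out_of_scope` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class WorkerSummary:
    """Hypothetical shape for what a worker hands back to the orchestrator."""
    status: str                       # "ok" or "failed"
    files_changed: list[str]
    commits: list[str]
    tests_passed: bool
    notable: list[str] = field(default_factory=list)

    def out_of_scope(self, allowed_root: str) -> list[str]:
        """Return changed files that fall outside the worker's allowed path."""
        root = PurePosixPath(allowed_root)
        return [
            f for f in self.files_changed
            if root not in PurePosixPath(f).parents
        ]


summary = WorkerSummary(
    status="ok",
    files_changed=["src/payments/retry.ts", "src/utils/http.ts"],
    commits=["feat(payments): retry with jitter"],
    tests_passed=True,
    notable=["Reused existing retry util"],
)
print(summary.out_of_scope("src/payments"))  # ['src/utils/http.ts']
```

A check like `out_of_scope` gives the orchestrator a cheap way to notice a worker that wandered outside its sandbox without reading the diff.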

Tools and versions

Hermes Agent (Nous Research) with delegate_task tool enabled, or any orchestrator that supports sub-agent spawning
Claude Code 1.2+ as a worker runtime (claude CLI)
MCP for shared capabilities across workers (filesystem, git, tests)
tmux (one pane per worker) or git worktree (one directory per worker)

Setup in five steps

01. Decide the dispatch shape

Two patterns work. Pick one per project.

Parallel-safe — workers operate on disjoint paths. Use git worktree so they never collide.
Sequential with handoff — worker N returns an artifact worker N+1 needs. Orchestrator serializes them.
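Both shapes can be checked mechanically before anything is dispatched. The sketch below assumes each task declares a single path scope and, for the sequential case, a set of task names it depends on; `paths_overlap`, `parallel_safe`, and `handoff_order` are hypothetical helpers, and the ordering uses the standard library's `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter
from pathlib import PurePosixPath


def paths_overlap(a: str, b: str) -> bool:
    """True if the two paths are equal or one contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def parallel_safe(scopes: dict[str, str]) -> bool:
    """True if no two tasks have overlapping path scopes."""
    names = list(scopes)
    return not any(
        paths_overlap(scopes[x], scopes[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    )


def handoff_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Order tasks so that each one runs after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


print(parallel_safe({"payments": "src/payments", "admin": "src/admin"}))  # True
print(handoff_order({"api": {"schema"}, "ui": {"api"}}))  # ['schema', 'api', 'ui']
```

If `parallel_safe` returns False for a plan, that is a signal to either redraw the task boundaries or fall back to the sequential shape.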

02. Create isolated worktrees for parallel workers

bash

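cd ~/projects/app
# Assumes each branch already exists; to create a new one, use: git worktree add -b <branch> <path>
git worktree add ../app-worker-1 feature/payments
git worktree add ../app-worker-2 feature/notifications
git worktree add ../app-worker-3 feature/admin-ui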

Each worker gets a directory. They cannot step on each other. Reviews happen as branch merges, not file conflicts.
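When the orchestrator creates worktrees itself, the same commands can be generated from the plan. This is a sketch under the same assumption as the shell version (the branches already exist); `worktree_commands` and `create_worktrees` are hypothetical helper names, and the `<repo>-worker-N` naming simply mirrors the example above.

```python
import subprocess
from pathlib import Path


def worktree_commands(repo: Path, branches: list[str]) -> list[list[str]]:
    """Build one `git worktree add` command per branch.

    Each worktree is a sibling directory of the repo named <repo>-worker-N.
    """
    return [
        ["git", "-C", str(repo), "worktree", "add",
         str(repo.parent / f"{repo.name}-worker-{i}"), branch]
        for i, branch in enumerate(branches, start=1)
    ]


def create_worktrees(repo: Path, branches: list[str]) -> None:
    """Run the commands; raises CalledProcessError if git rejects one."""
    for cmd in worktree_commands(repo, branches):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it easy to log or dry-run the plan before touching the repository.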

03. Write a worker brief template

Keep it boring. Boring briefs produce predictable results.

markdown

# Worker brief

## Context
[one paragraph — what is the surrounding system]

## Goal
[one paragraph — what done looks like, measurable]

## Constraints
- Do not touch files outside src/payments/
- Tests must pass: pnpm test payments
- Commit style: conventional, scope = payments

## Return
- List of files changed
- Test output (last 20 lines)
- Anything surprising (< 5 bullets)

Paste this into every delegate_task call. Fill in Context and Goal, point the Constraints at the task's own path and test command, and leave the Return section unchanged.
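Filling the template by hand invites drift between briefs, so it can help to render it from a few fields. The sketch below uses the standard library's `string.Template`; the placeholder names (`context`, `goal`, `scope_path`, `test_command`, `scope`) and the `render_brief` function are assumptions for illustration, not part of any orchestrator's API.

```python
from string import Template

BRIEF = Template("""\
# Worker brief

## Context
$context

## Goal
$goal

## Constraints
- Do not touch files outside $scope_path
- Tests must pass: $test_command
- Commit style: conventional, scope = $scope

## Return
- List of files changed
- Test output (last 20 lines)
- Anything surprising (< 5 bullets)
""")


def render_brief(**fields: str) -> str:
    """Fill the template; raises KeyError if any placeholder is missing."""
    return BRIEF.substitute(**fields)


brief = render_brief(
    context="Payments service behind the checkout API.",
    goal="Retries use exponential backoff with jitter; payments tests pass.",
    scope_path="src/payments/",
    test_command="pnpm test payments",
    scope="payments",
)
print(brief)
```

Using `substitute` rather than `safe_substitute` is deliberate: a brief with an unfilled placeholder fails loudly instead of reaching a worker half-specified.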

04. Dispatch and review in a loop

python

# Orchestrator pseudocode: how Hermes does it
for task in plan:
    result = delegate_task(
        brief=render(template, task),
        workdir=task.worktree,
        timeout=900,  # 15-minute cap, in seconds
        notify_on_complete=True,
    )
    if result.status != "ok":
        adjust_plan(result)
    else:
        merge_worktree(task.worktree)

In practice this is one MCP tool call per task. The orchestrator sits in an event loop, dispatching and merging.
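For the parallel-safe shape, the loop can dispatch all tasks at once and handle results as they arrive. The following is a runnable sketch built on `concurrent.futures`, not how any particular orchestrator implements it: `delegate_task` here is a stand-in stub with an assumed signature, and `run_plan` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def delegate_task(brief: str, workdir: str) -> dict:
    """Stand-in for the real dispatch call; a real worker would run here."""
    status = "failed" if "FAIL" in brief else "ok"
    return {"status": status, "workdir": workdir}


def run_plan(tasks: list[dict], max_workers: int = 3) -> tuple[list[str], list[str]]:
    """Dispatch every task concurrently.

    Returns (worktrees ready to merge, worktrees that need replanning).
    """
    merged, replan = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(delegate_task, t["brief"], t["workdir"]) for t in tasks
        ]
        for future in as_completed(futures):
            result = future.result()
            bucket = merged if result["status"] == "ok" else replan
            bucket.append(result["workdir"])
    return sorted(merged), sorted(replan)


tasks = [
    {"brief": "add retry with jitter", "workdir": "../app-worker-1"},
    {"brief": "FAIL: underspecified", "workdir": "../app-worker-2"},
]
print(run_plan(tasks))  # (['../app-worker-1'], ['../app-worker-2'])
```

Threads are enough here because the orchestrator side only waits on workers; the heavy lifting happens in the worker sessions.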

05. Keep the orchestrator context clean

Never paste worker transcripts back into the orchestrator. Only the summary. If a worker writes 2000 lines of reasoning, the orchestrator sees:

yaml

status: ok
files_changed: 7
tests: pass (42/42)
commits: ["feat(payments): retry with jitter", "test(payments): jitter property tests"]
notable:
- Found existing retry util in src/utils/http.ts — reused it
- Skipped idempotency key for now (needs schema change)

Under ten lines. The orchestrator stays sharp through fifty of these. Your context budget is a resource: spend it on planning, not on replaying transcripts.
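The budget argument is plain arithmetic. The sketch below plugs in the figures from the text (2000-line transcripts, roughly ten-line summaries, fifty workers); counting lines rather than tokens is a simplification, and `orchestrator_lines` is a hypothetical helper.

```python
def orchestrator_lines(workers: int, lines_per_result: int) -> int:
    """Lines of worker output the orchestrator has to hold in context."""
    return workers * lines_per_result


transcripts = orchestrator_lines(50, 2000)  # pasting full transcripts
summaries = orchestrator_lines(50, 10)      # pasting summaries only

print(transcripts)               # 100000
print(summaries)                 # 500
print(transcripts // summaries)  # 200
```

Under these assumptions the summary-only orchestrator carries about one two-hundredth of the worker output.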

Multi-agent is not free. Each worker spin-up costs 10–30 seconds of context load and warm-up. Dispatch is useful when:

The task is self-contained — you can brief it in under 200 words
The task would take more than 5 minutes if you did it inline
You have at least two such tasks that can run in parallel
A clear acceptance signal exists (tests pass, command returns zero)
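The four criteria above can be written down as a checklist the orchestrator runs before dispatching. This is a rough sketch of that heuristic, with the thresholds taken from the list; the `Task` fields and the `dispatchable` and `should_dispatch` functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    brief_words: int             # length of the brief needed to explain it
    inline_minutes: float        # estimated time if done in the main session
    parallel_safe: bool          # touches paths no other task touches
    has_acceptance_signal: bool  # e.g. tests pass, command returns zero


def dispatchable(task: Task) -> bool:
    """True if a single task meets the per-task criteria."""
    return (
        task.brief_words < 200
        and task.inline_minutes > 5
        and task.parallel_safe
        and task.has_acceptance_signal
    )


def should_dispatch(tasks: list[Task]) -> bool:
    """Dispatch only when at least two tasks qualify."""
    return sum(dispatchable(t) for t in tasks) >= 2


plan = [
    Task(brief_words=120, inline_minutes=20, parallel_safe=True, has_acceptance_signal=True),
    Task(brief_words=150, inline_minutes=12, parallel_safe=True, has_acceptance_signal=True),
    Task(brief_words=400, inline_minutes=30, parallel_safe=False, has_acceptance_signal=False),
]
print(should_dispatch(plan))  # True
```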

Dispatch is harmful when:

You are still discovering the shape of the problem. Discovery work needs the long context, not a worker that forgets what you learned.
The task is under-specified. Workers hallucinate finish lines. A bad brief plus autonomy yields a confidently wrong diff.
Files are entangled. Two workers editing adjacent code produce merge hell even with worktrees.
You need to steer mid-flight. Workers are fire-and-forget. If you want to course-correct in real time, use a single interactive session.

The cognitive cost: you trade “watching one agent” for “orchestrating N agents.” The orchestrator is a new job. If you are not ready to be a manager of models, stay with a single interactive session. The dual-driver Claude Code + Cursor stack handles most projects without any dispatch at all.

Our honest rule: we use delegation for any task that has three or more parallel-safe units of work and a week of runway. Below that, it is overhead.

Multi-agent work is not a scaling strategy. It is a context strategy. You are not making the model smarter — you are making the session shorter. Short sessions think better. Ten short sessions think better than one long one. That is the whole insight.