Hermes-style multi-agent workflows
One orchestrator, many workers, clean context boundaries. The delegate_task pattern that turns long sessions into parallel sprints without context explosion.
Who this is for
You have hit the context wall. A single agent session starts coherent, drifts after 30 tool calls, loses the original goal by call 60, and rereads the same file four times by call 100. You want to ship features that take hours, not sessions that take hours.
You also want to stop watching the model think. Parallel workers mean one finishes while another starts; you review instead of waiting.
The pattern, in one sentence
The orchestrator decides what to do. Workers do it. Results are summaries, not transcripts.
How it works
The orchestrator
One long-lived session that holds the high-level goal. It:
- Reads the repo layout once
- Writes a plan: N tasks, clear scope per task, dependency order
- Spawns workers via delegate_task (or shell-launched subagents)
- Reviews returned summaries
- Commits, moves on
The orchestrator never edits a single line of code itself. It routes, reviews, and decides.
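The plan itself can be a very small data structure. The sketch below assumes a hypothetical Task record (not a Hermes type) and uses Python's standard graphlib to turn declared dependencies into dispatch waves. The scopes other than src/payments/ are made up for illustration.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Task:
    """Hypothetical plan entry: one task, one scope, explicit dependencies."""
    name: str
    scope: str                          # the only path the worker may touch
    depends_on: tuple[str, ...] = ()


def dispatch_waves(tasks: list[Task]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all dependencies met."""
    sorter = TopologicalSorter({t.name: set(t.depends_on) for t in tasks})
    sorter.prepare()                    # raises CycleError if the plan is circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


plan = [
    Task("payments", "src/payments/"),
    Task("notifications", "src/notifications/"),
    Task("admin-ui", "src/admin/", depends_on=("payments", "notifications")),
]
print(dispatch_waves(plan))
```

Tasks in the same wave can go to parallel workers; a later wave waits until the earlier one has been reviewed and merged.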
The workers
Short-lived sessions with narrow briefs. A worker:
- Receives one paragraph of context + one paragraph of goal
- Has full tool access inside a sandboxed path (a git worktree is ideal)
- Runs to completion — usually 5–15 minutes
- Returns a compact structured summary: files touched, commits made, tests passing, anything surprising
Workers do not know the larger plan. They do not need to. Their loss of context is a feature — they cannot drift because they cannot see the horizon.
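One way to keep the returned summary compact is to give it a fixed shape. This is a minimal sketch, assuming a hypothetical WorkerSummary record; the field names mirror the list above and are not part of any Hermes or Claude Code API.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerSummary:
    """Hypothetical shape for what a worker hands back."""
    status: str                                   # "ok" or a short failure label
    files_changed: list[str] = field(default_factory=list)
    commits: list[str] = field(default_factory=list)
    tests_passed: int = 0
    tests_total: int = 0
    notable: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the summary as the few lines the orchestrator will read."""
        tests_ok = self.tests_total > 0 and self.tests_passed == self.tests_total
        verdict = "pass" if tests_ok else "fail"
        lines = [
            f"status: {self.status}",
            f"files_changed: {len(self.files_changed)}",
            f"tests: {verdict} ({self.tests_passed}/{self.tests_total})",
            "commits: [" + ", ".join(self.commits) + "]",
            "notable:",
        ]
        lines += [f"  - {note}" for note in self.notable]
        return "\n".join(lines)


summary = WorkerSummary(
    status="ok",
    files_changed=["src/payments/retry.ts"],      # illustrative path
    commits=["feat(payments): retry with jitter"],
    tests_passed=42,
    tests_total=42,
    notable=["Found existing retry util in src/utils/http.ts, reused it"],
)
print(summary.render())
```

The orchestrator only ever sees the rendered lines, so the worker's reasoning never enters its context.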
Tools and versions
- Hermes Agent (Nous Research) with the delegate_task tool enabled, or any orchestrator that supports sub-agent spawning
- Claude Code 1.2+ as a worker runtime (claude CLI)
- MCP for shared capabilities across workers (filesystem, git, tests)
- tmux (one pane per worker) or git worktree (one directory per worker)
Setup in five steps
01. Decide the dispatch shape
Two patterns work. Pick one per project.
- Parallel-safe: workers operate on disjoint paths. Use git worktree so they never collide.
- Sequential with handoff: worker N returns an artifact worker N+1 needs. Orchestrator serializes them.
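A quick way to test whether a set of tasks is parallel-safe is to check that no task's scope contains another's. This is a rough sketch with hypothetical helper names; it only looks at paths and says nothing about artifact handoffs between tasks.

```python
from pathlib import PurePosixPath


def overlaps(a: str, b: str) -> bool:
    """True if one path equals or contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def is_parallel_safe(scopes: list[str]) -> bool:
    """True when every pair of scopes is disjoint."""
    for i, first in enumerate(scopes):
        for second in scopes[i + 1:]:
            if overlaps(first, second):
                return False
    return True


print(is_parallel_safe(["src/payments", "src/notifications", "src/admin"]))
print(is_parallel_safe(["src", "src/payments"]))
```

If the check fails, either narrow the scopes or fall back to sequential dispatch for the overlapping tasks.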
02. Create isolated worktrees for parallel workers
cd ~/projects/app
git worktree add ../app-worker-1 feature/payments
git worktree add ../app-worker-2 feature/notifications
git worktree add ../app-worker-3 feature/admin-ui
Each worker gets a directory. They cannot step on each other. Reviews happen as branch merges, not file conflicts.
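With more than a handful of workers it helps to generate these commands. The sketch below only builds the argument lists and runs nothing; the helper name is hypothetical. It adds git's -b flag, which creates the branch, on the assumption that the feature branches do not exist yet.

```python
def worktree_commands(repo_dir: str, features: list[str]) -> list[list[str]]:
    """Build one `git worktree add` argv per feature; nothing is executed here."""
    commands = []
    for index, feature in enumerate(features, start=1):
        path = f"../{repo_dir}-worker-{index}"
        # -b creates the branch; drop it if feature/<name> already exists.
        commands.append(["git", "worktree", "add", "-b", f"feature/{feature}", path])
    return commands


for argv in worktree_commands("app", ["payments", "notifications", "admin-ui"]):
    print(" ".join(argv))
    # To run for real: import subprocess; subprocess.run(argv, check=True)
```

Keeping generation separate from execution lets the orchestrator show the commands for review before anything touches the repository.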
03. Write a worker brief template
Keep it boring. Boring briefs produce predictable results.
# Worker brief
## Context
[one paragraph — what is the surrounding system]
## Goal
[one paragraph — what done looks like, measurable]
## Constraints
- Do not touch files outside src/payments/
- Tests must pass: pnpm test payments
- Commit style: conventional, scope = payments
## Return
- List of files changed
- Test output (last 20 lines)
- Anything surprising (< 5 bullets)
Paste this into every delegate_task call, fill in the top, leave the rest.
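Filling the template by hand gets error-prone after a few tasks. Here is a minimal sketch using Python's standard string.Template; the placeholder names ($context, $goal, $scope, $test_command, $commit_scope) and the example context and goal text are assumptions made for illustration.

```python
from string import Template

BRIEF_TEMPLATE = Template("""\
# Worker brief
## Context
$context
## Goal
$goal
## Constraints
- Do not touch files outside $scope
- Tests must pass: $test_command
- Commit style: conventional, scope = $commit_scope
## Return
- List of files changed
- Test output (last 20 lines)
- Anything surprising (< 5 bullets)
""")


def render_brief(**fields: str) -> str:
    """Fill the template; substitute() raises KeyError if a field is missing."""
    return BRIEF_TEMPLATE.substitute(**fields)


brief = render_brief(
    context="Checkout service; payment code lives in src/payments/.",
    goal="Retries use jitter and the payments test suite passes.",
    scope="src/payments/",
    test_command="pnpm test payments",
    commit_scope="payments",
)
print(brief)
```

Using substitute() rather than safe_substitute() means a half-filled brief fails loudly instead of reaching a worker with a raw placeholder in it.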
04. Dispatch and review in a loop
# Orchestrator pseudocode: how Hermes does it
for task in plan:
    result = delegate_task(
        brief=render(template, task),
        workdir=task.worktree,
        timeout=900,  # 15-minute cap
        notify_on_complete=True,
    )
    if result.status != "ok":
        adjust_plan(result)
    else:
        merge_worktree(task.worktree)
In practice this is one MCP tool call per task. The orchestrator sits in an event loop, dispatching and merging.
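For the parallel-safe shape, the same loop can be sketched as runnable Python. Everything here is an assumption for illustration: delegate_task is a local stub that simulates a worker, not the real tool, and Result is a hypothetical record.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class Result:
    task: str
    status: str
    summary: str


def delegate_task(brief: str, workdir: str, timeout: int = 900) -> Result:
    """Stub standing in for the real tool call; it only simulates a worker."""
    status = "failed" if "flaky" in brief else "ok"
    return Result(task=workdir, status=status, summary=f"{workdir}: {status}")


def run_plan(tasks: dict[str, str], max_workers: int = 3) -> tuple[list[str], list[str]]:
    """Dispatch every task in parallel; return (to_merge, to_replan)."""
    to_merge, to_replan = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(delegate_task, brief, workdir)
                   for workdir, brief in tasks.items()]
        for future in as_completed(futures):
            result = future.result()
            (to_merge if result.status == "ok" else to_replan).append(result.task)
    return sorted(to_merge), sorted(to_replan)


merged, replan = run_plan({
    "../app-worker-1": "payments brief",
    "../app-worker-2": "notifications brief",
    "../app-worker-3": "flaky admin-ui brief",
})
print("merge:", merged)
print("replan:", replan)
```

Results are handled in completion order, so a slow worker never blocks the review of a fast one.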
05. Keep the orchestrator context clean
Never paste worker transcripts back into the orchestrator. Only the summary. If a worker writes 2000 lines of reasoning, the orchestrator sees:
status: ok
files_changed: 7
tests: pass (42/42)
commits: [feat(payments): retry with jitter, test(payments): jitter property tests]
notable:
  - Found existing retry util in src/utils/http.ts; reused it
  - Skipped idempotency key for now (needs schema change)
Seven lines. The orchestrator stays sharp for fifty of these. Your context budget is a resource. Spend it on planning, not on replaying.
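You can enforce this rule mechanically instead of by discipline. The sketch below assumes workers print a delimiter line before their summary; the marker string and the ten-line budget are both hypothetical choices, not Hermes conventions.

```python
SUMMARY_MARKER = "=== SUMMARY ==="   # hypothetical delimiter requested in the brief
MAX_SUMMARY_LINES = 10               # illustrative budget


def extract_summary(worker_output: str) -> str:
    """Return only the text after the last marker, capped at the line budget."""
    _, found, tail = worker_output.rpartition(SUMMARY_MARKER)
    if not found:
        return "status: unknown\nnotable:\n  - worker returned no summary block"
    lines = [line for line in tail.strip().splitlines() if line.strip()]
    if len(lines) > MAX_SUMMARY_LINES:
        dropped = len(lines) - MAX_SUMMARY_LINES + 1
        lines = lines[:MAX_SUMMARY_LINES - 1] + [f"  - ({dropped} more lines truncated)"]
    return "\n".join(lines)


transcript = "long reasoning...\n" * 2000 + SUMMARY_MARKER + "\nstatus: ok\nfiles_changed: 7\n"
print(extract_summary(transcript))
```

Whatever the worker wrote before the marker is dropped, and an over-long summary is truncated rather than passed through.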
Cost, privacy, performance
Related flash tutorials
- Multi-agent orchestration — the tactical version
- Claude Code setup — install the worker runtime
- MCP server setup — shared tools across workers
Multi-agent work is not a scaling strategy. It is a context strategy. You are not making the model smarter — you are making the session shorter. Short sessions think better. Ten short sessions think better than one long one. That is the whole insight.