Hermes-style multi-agent workflows

One orchestrator, many workers, clean context boundaries. The delegate_task pattern that turns long sessions into parallel sprints without context explosion.

Updated May 08, 2026 by xlrd · Fig. S04

Who this is for

You have hit the context wall. A single agent session starts coherent, drifts after 30 tool calls, loses the original goal by call 60, and rereads the same file four times by call 100. You want to ship features that take hours, not sessions that take hours.

You also want to stop watching the model think. With parallel workers, one finishes while another starts, so you review instead of waiting.

The pattern, in one sentence

The orchestrator decides what to do. Workers do it. Results are summaries, not transcripts.

How it works

The orchestrator

One long-lived session that holds the high-level goal. It:

  • Reads the repo layout once
  • Writes a plan: N tasks, clear scope per task, dependency order
  • Spawns workers via delegate_task (or shell-launched subagents)
  • Reviews returned summaries
  • Merges the worker's branch, moves on

The orchestrator never edits a single line of code itself. It routes, reviews, and decides.
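The plan step can be sketched as a small data structure. This is a minimal illustration, not Hermes' internal format: the `Task` class, the `dispatch_order` helper, the example paths other than src/payments/, and the dependency of admin-ui on the other two tasks are all assumptions made for the example. It uses the standard-library `graphlib` module (Python 3.9+) to turn "dependency order" into a concrete dispatch sequence.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Task:
    """One unit of work handed to a single worker (hypothetical shape)."""
    name: str
    scope: str                                   # path the worker may touch
    depends_on: list[str] = field(default_factory=list)


def dispatch_order(tasks: list[Task]) -> list[str]:
    """Return task names so that every task comes after its dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {t.name: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


# Illustrative plan; the dependency on payments/notifications is assumed.
plan = [
    Task("admin-ui", "src/admin/", depends_on=["payments", "notifications"]),
    Task("payments", "src/payments/"),
    Task("notifications", "src/notifications/"),
]
order = dispatch_order(plan)
print(order)  # admin-ui is always last; the other two may come in either order
```

Tasks with no dependencies between them are candidates for parallel dispatch. A cycle in the plan raises `graphlib.CycleError`, which is a useful early signal that the plan needs rewriting.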

The workers

Short-lived sessions with narrow briefs. A worker:

  • Receives one paragraph of context + one paragraph of goal
  • Has full tool access inside a sandboxed path (a git worktree is ideal)
  • Runs to completion — usually 5–15 minutes
  • Returns a compact structured summary: files touched, commits made, tests passing, anything surprising

Workers do not know the larger plan. They do not need to. Their loss of context is a feature — they cannot drift because they cannot see the horizon.
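One way to keep the return value compact is to give it a fixed shape and reject anything else. The sketch below assumes the worker replies with a JSON object; that wire format, the `WorkerSummary` class, and the `parse_summary` helper are assumptions for illustration, with fields taken from the list above.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WorkerSummary:
    """The only thing the orchestrator ever sees from a worker (assumed shape)."""
    status: str
    files_changed: list[str]
    commits: list[str]
    tests: str
    notable: list[str] = field(default_factory=list)


def parse_summary(raw: str) -> WorkerSummary:
    """Parse a worker's JSON reply.

    Unknown keys (for example a full transcript) raise TypeError, so a
    chatty worker cannot leak extra context into the orchestrator.
    """
    return WorkerSummary(**json.loads(raw))


# Illustrative reply, not real worker output.
raw = json.dumps({
    "status": "ok",
    "files_changed": ["src/payments/retry.ts"],
    "commits": ["feat(payments): retry with jitter"],
    "tests": "pass",
    "notable": ["Reused existing retry util"],
})
summary = parse_summary(raw)
print(summary.status, len(summary.files_changed))  # ok 1
```

The strict constructor is the design choice here: a summary that does not fit the shape fails loudly instead of quietly growing the orchestrator's context.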

Tools and versions

  • Hermes Agent (Nous Research) with delegate_task tool enabled, or any orchestrator that supports sub-agent spawning
  • Claude Code 1.2+ as a worker runtime (claude CLI)
  • MCP for shared capabilities across workers (filesystem, git, tests)
  • tmux (one pane per worker) or git worktree (one directory per worker)

Setup in five steps

01. Decide the dispatch shape

Two patterns work. Pick one per project.

  • Parallel-safe — workers operate on disjoint paths. Use git worktree so they never collide.
  • Sequential with handoff — worker N returns an artifact worker N+1 needs. Orchestrator serializes them.
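A rough way to check whether a plan is parallel-safe is to test that no task's path contains another's. This is a heuristic sketch: the `overlaps` and `dispatch_shape` helpers are hypothetical, and it treats any path overlap as a reason to serialize. It does not detect artifact handoffs, which still have to be declared in the plan.

```python
from pathlib import PurePosixPath


def overlaps(a: str, b: str) -> bool:
    """True if one path equals or contains the other (compared by component)."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def dispatch_shape(scopes: list[str]) -> str:
    """Return "parallel-safe" if all scopes are pairwise disjoint, else "sequential"."""
    for i, a in enumerate(scopes):
        for b in scopes[i + 1:]:
            if overlaps(a, b):
                return "sequential"
    return "parallel-safe"


print(dispatch_shape(["src/payments/", "src/notifications/", "src/admin/"]))
# parallel-safe
print(dispatch_shape(["src/payments/", "src/payments/retry/"]))
# sequential
```

Comparing by path component rather than by string prefix means src/pay and src/payments are correctly treated as disjoint.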

02. Create isolated worktrees for parallel workers

bash
cd ~/projects/app
# -b creates the branch from HEAD; drop it if the branch already exists
git worktree add -b feature/payments ../app-worker-1
git worktree add -b feature/notifications ../app-worker-2
git worktree add -b feature/admin-ui ../app-worker-3

Each worker gets a directory. They cannot step on each other. Reviews happen as branch merges, not file conflicts.
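An orchestrator that needs to know which directory holds which branch can read it from `git worktree list --porcelain`. The parser below is a minimal sketch: the `parse_worktrees` helper is hypothetical, and the sample text uses placeholder paths and hashes rather than real output.

```python
def parse_worktrees(porcelain: str) -> dict[str, str]:
    """Map worktree path -> branch name from `git worktree list --porcelain`.

    Detached or bare worktrees have no "branch" line and are left out.
    """
    mapping: dict[str, str] = {}
    path = None
    for line in porcelain.splitlines():
        if line.startswith("worktree "):
            path = line[len("worktree "):]
        elif line.startswith("branch ") and path is not None:
            ref = line[len("branch "):]
            mapping[path] = ref.removeprefix("refs/heads/")
    return mapping


# Placeholder input in the porcelain layout; hashes are dummies.
sample = """\
worktree /home/me/projects/app
HEAD 1111111111111111111111111111111111111111
branch refs/heads/main

worktree /home/me/projects/app-worker-1
HEAD 2222222222222222222222222222222222222222
branch refs/heads/feature/payments
"""
worktrees = parse_worktrees(sample)
print(worktrees["/home/me/projects/app-worker-1"])  # feature/payments
```

In a live repo the input would come from something like `subprocess.run(["git", "worktree", "list", "--porcelain"], capture_output=True, text=True, check=True).stdout`.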

03. Write a worker brief template

Keep it boring. Boring briefs produce predictable results.

markdown
# Worker brief

## Context
[one paragraph — what is the surrounding system]

## Goal
[one paragraph — what done looks like, measurable]

## Constraints
- Do not touch files outside src/payments/
- Tests must pass: pnpm test payments
- Commit style: conventional, scope = payments

## Return
- List of files changed
- Test output (last 20 lines)
- Anything surprising (< 5 bullets)

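Paste this into every delegate_task call. Fill in Context and Goal, point the Constraints at the task's own path, test command, and commit scope, and leave Return unchanged.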
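Filling the brief can be automated with the standard-library `string.Template`. This is a sketch under assumptions: the placeholder names (`$context`, `$goal`, `$scope`, `$test_command`, `$commit_scope`) and the `render` helper are made up for the example, and the sample task text is illustrative.

```python
from string import Template

BRIEF = Template("""\
# Worker brief

## Context
$context

## Goal
$goal

## Constraints
- Do not touch files outside $scope
- Tests must pass: $test_command
- Commit style: conventional, scope = $commit_scope

## Return
- List of files changed
- Test output (last 20 lines)
- Anything surprising (< 5 bullets)
""")


def render(template: Template, task: dict[str, str]) -> str:
    """Fill the brief; raises KeyError if the task is missing a placeholder."""
    return template.substitute(task)


# Illustrative task; the wording is invented for the example.
brief = render(BRIEF, {
    "context": "Payments code lives in src/payments/ and talks to an HTTP API.",
    "goal": "Outbound payment requests retry with jitter; tests cover the retry path.",
    "scope": "src/payments/",
    "test_command": "pnpm test payments",
    "commit_scope": "payments",
})
print(brief.splitlines()[0])  # # Worker brief
```

`substitute` fails on a missing field instead of sending a half-filled brief, which keeps the briefs boring in the way the template intends.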

04. Dispatch and review in a loop

python
# Orchestrator pseudocode: how Hermes does it
for task in plan:
    result = delegate_task(
        brief=render(template, task),
        workdir=task.worktree,
        timeout=900,  # 15 min cap
        notify_on_complete=True,
    )
    if result.status != "ok":
        adjust_plan(result)  # re-scope or retry the failed task
    else:
        merge_worktree(task.worktree)  # fold the worker's branch back in

In practice this is one MCP tool call per task. The orchestrator sits in an event loop, dispatching and merging.
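For parallel-safe plans, the dispatch-and-merge loop can be sketched with a thread pool. The `delegate_task` function here is a local stand-in that always succeeds, not the real Hermes tool, and `run_plan` and the task dictionary keys are hypothetical; the point is only the shape of the loop.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def delegate_task(brief: str, workdir: str, timeout: int) -> dict:
    """Stand-in for the real tool call: pretends the worker succeeded."""
    return {"status": "ok", "workdir": workdir}


def run_plan(tasks: list[dict], max_workers: int = 3) -> tuple[list[str], list[str]]:
    """Dispatch every task in parallel; return (merged, needs_replan) worktrees."""
    merged: list[str] = []
    needs_replan: list[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(delegate_task, t["brief"], t["worktree"], 900): t
            for t in tasks
        }
        # Handle results in completion order, not submission order.
        for future in as_completed(futures):
            task = futures[future]
            result = future.result()
            if result["status"] == "ok":
                merged.append(task["worktree"])       # a real loop would merge the branch here
            else:
                needs_replan.append(task["worktree"])  # a real loop would adjust the plan here
    return merged, needs_replan


tasks = [
    {"brief": "payments brief", "worktree": "../app-worker-1"},
    {"brief": "notifications brief", "worktree": "../app-worker-2"},
    {"brief": "admin-ui brief", "worktree": "../app-worker-3"},
]
merged, needs_replan = run_plan(tasks)
print(sorted(merged), needs_replan)
```

Processing results as they complete is what lets you review one worker's summary while the others are still running.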

05. Keep the orchestrator context clean

Never paste worker transcripts back into the orchestrator. Only the summary. If a worker writes 2000 lines of reasoning, the orchestrator sees:

yaml
status: ok
files_changed: 7
tests: pass (42/42)
commits: ["feat(payments): retry with jitter", "test(payments): jitter property tests"]
notable:
- Found existing retry util in src/utils/http.ts — reused it
- Skipped idempotency key for now (needs schema change)

Under ten lines. The orchestrator stays sharp for fifty of these. Your context budget is a resource. Spend it on planning, not on replaying.
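The budget argument is plain arithmetic. The sketch below uses only the figures from the text (2000 lines of worker reasoning, a summary rounded up to ten lines, fifty workers) and counts lines rather than tokens, so treat it as an order-of-magnitude illustration.

```python
TRANSCRIPT_LINES = 2000   # from the text: one worker's full reasoning
SUMMARY_LINES = 10        # from the text, rounded up: what the orchestrator sees
WORKERS = 50              # from the text: "fifty of these"

with_transcripts = WORKERS * TRANSCRIPT_LINES   # lines if transcripts were pasted back
with_summaries = WORKERS * SUMMARY_LINES        # lines if only summaries come back
ratio = with_transcripts // with_summaries

print(with_transcripts, with_summaries, ratio)  # 100000 500 200
```

Fifty summaries cost the orchestrator about a quarter of what a single pasted transcript would.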

Cost, privacy, performance

Multi-agent work is not a scaling strategy. It is a context strategy. You are not making the model smarter — you are making the session shorter. Short sessions think better. Ten short sessions think better than one long one. That is the whole insight.