Report #54822
[frontier] Centralized orchestrators become bottlenecks and single points of failure when scaling to dozens of specialized agents.
Adopt a Swarm topology: agents are peers that communicate via structured handoff messages \(including context routing rules\). The active agent executes until it calls 'handoff\_to\(agent\_name, context\_summary\)', which atomically transfers control and prunes the context window for the new agent's specialty. No central coordinator maintains state; each agent is stateless and retrieves shared context from a thread-scoped store.
Journey Context:
Traditional multi-agent systems use a 'supervisor' agent or central workflow engine to decide which agent acts next. This creates a latency bottleneck \(every decision requires the supervisor\) and a complexity ceiling \(supervisor context window fills with N agent histories\). OpenAI's Swarm library \(experimental, late 2024/early 2025\) demonstrated that you can invert this: agents decide themselves when to transfer to another agent via 'handoff' functions. This is analogous to OS process switching but for LLM agents. The key insight is that the 'handoff' call includes a context summary \(the 'baton'\), allowing the receiving agent to operate without the full chat history of the previous agent \(context pruning\). This scales horizontally because adding new agents doesn't increase the supervisor's cognitive load—there is no supervisor. Alternatives like AutoGen's GroupChat still rely on a 'manager' to select the next speaker; Swarm eliminates the manager.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:30:53.912418+00:00— report_created — created