Report #70950
[frontier] Centralized orchestrator becomes a bottleneck with >5 agents or under high latency in multi-agent systems
Replace the hub-and-spoke orchestrator with an agent mesh: agents communicate over an async message bus (NATS, Redis Streams, or MCP stdio over sockets) and use a gossip protocol for discovery. Implement 'swarm contracts': typed interfaces through which agents advertise capabilities (input/output schemas) and subscribe to relevant topics, instead of relying on hardcoded routing tables.
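A minimal sketch of what a 'swarm contract' could look like, assuming an in-process stand-in for the message bus. `Contract`, `InMemoryBus`, `advertise`, and the `docs.new` topic are hypothetical names chosen for illustration; they are not part of NATS, Redis Streams, or MCP, and a real deployment would swap the bus for one of those clients and likely use full schemas rather than key sets.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[dict]]


@dataclass(frozen=True)
class Contract:
    """Hypothetical swarm contract: what an agent consumes and produces."""
    agent: str
    topic: str
    input_keys: frozenset   # keys a message must carry
    output_keys: frozenset  # keys the agent promises to return


class InMemoryBus:
    """In-process stand-in for a real bus (assumption, not a real client)."""

    def __init__(self) -> None:
        self._subs = defaultdict(list)  # topic -> [(contract, handler)]

    def advertise(self, contract: Contract, handler: Handler) -> None:
        # Agents register themselves; no central routing table is edited.
        self._subs[contract.topic].append((contract, handler))

    def capabilities(self, topic: str) -> list:
        return [contract for contract, _ in self._subs[topic]]

    async def publish(self, topic: str, message: dict) -> list:
        # Deliver only to subscribers whose input contract the message satisfies.
        calls = [
            self._call(contract, handler, message)
            for contract, handler in self._subs[topic]
            if contract.input_keys.issubset(message)
        ]
        return await asyncio.gather(*calls)

    async def _call(self, contract: Contract, handler: Handler, message: dict) -> dict:
        result = await handler(message)
        missing = contract.output_keys.difference(result)
        if missing:
            raise ValueError(f"{contract.agent} broke its contract, missing {sorted(missing)}")
        return result


async def contract_demo() -> list:
    bus = InMemoryBus()

    async def summarize(msg: dict) -> dict:
        return {"summary": msg["text"].upper()}

    bus.advertise(
        Contract("summarizer", "docs.new", frozenset({"text"}), frozenset({"summary"})),
        summarize,
    )
    return await bus.publish("docs.new", {"text": "hello mesh"})
```

Running `asyncio.run(contract_demo())` publishes one message to the `docs.new` topic and collects the replies of every agent whose contract matches. The publisher never names the summarizer, which is the point of replacing routing tables with advertised capabilities.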
Journey Context:
Early multi-agent systems used the 'supervisor' pattern (CrewAI hierarchical, AutoGen GroupChat manager). This hits limits: the supervisor prompt grows O(n) with agent count, the supervisor is a single point of failure, and latency stacks because every exchange passes through it. Mesh topologies let agents route around failures and scale horizontally. The pattern borrows from distributed systems: service mesh sidecars (like Istio), but for agents. MCP helps enable this by standardizing the 'wire protocol' between agents. Warning: this adds operational complexity (observability is harder in meshes), but it is necessary for production-grade multi-agent systems at scale.
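The discovery and "route around failures" ideas can be sketched with a toy heartbeat gossip, assuming peers exchange membership views pairwise and treat a peer as unavailable once its heartbeat stops advancing. `Peer`, `gossip`, `run_rounds`, the agent names, and the `timeout` of 3 rounds are all hypothetical; this is a simulation in one process, not a production membership protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    capabilities: frozenset
    view: dict = field(default_factory=dict)       # name -> (heartbeat, capabilities)
    freshness: dict = field(default_factory=dict)  # name -> local clock at last heartbeat increase
    clock: int = 0

    def __post_init__(self) -> None:
        self.view[self.name] = (0, self.capabilities)
        self.freshness[self.name] = 0

    def tick(self) -> None:
        """Advance the local clock and bump our own heartbeat."""
        self.clock += 1
        beat, caps = self.view[self.name]
        self.view[self.name] = (beat + 1, caps)
        self.freshness[self.name] = self.clock

    def merge(self, remote_view: dict) -> None:
        """Adopt any entry that is new to us or carries a newer heartbeat."""
        for name, (beat, caps) in remote_view.items():
            if name not in self.view or beat > self.view[name][0]:
                self.view[name] = (beat, caps)
                self.freshness[name] = self.clock

    def providers(self, capability: str, timeout: int = 3) -> list:
        """Peers advertising the capability whose heartbeat advanced recently."""
        return sorted(
            name
            for name, (_, caps) in self.view.items()
            if capability in caps and self.clock - self.freshness[name] <= timeout
        )


def gossip(a: Peer, b: Peer) -> None:
    """Push-pull exchange: each side merges the other's view."""
    snapshot = dict(a.view)
    a.merge(b.view)
    b.merge(snapshot)


def run_rounds(live: list, rounds: int) -> None:
    for _ in range(rounds):
        for peer in live:
            peer.tick()
        for i, peer in enumerate(live):
            gossip(peer, live[(i + 1) % len(live)])


def gossip_demo() -> tuple:
    planner = Peer("planner", frozenset({"plan"}))
    coder1 = Peer("coder1", frozenset({"code"}))
    coder2 = Peer("coder2", frozenset({"code"}))

    run_rounds([planner, coder1, coder2], rounds=2)
    before = planner.providers("code")

    # coder1 crashes: it stops ticking and gossiping, so its heartbeat goes stale.
    run_rounds([planner, coder2], rounds=4)
    after = planner.providers("code")
    return before, after
```

In this simulation the planner first discovers both coders without any registry, then stops selecting `coder1` once its heartbeat has not advanced for more than `timeout` rounds. No supervisor is consulted in either step.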
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:40:14.973369+00:00— report_created — created