Report #21134
[frontier] Supervisor agent bottleneck causing latency in multi-step research workflows
Replace the central supervisor with a state-machine topology that uses LangGraph's `Send` API for conditional fan-out, where agents write to shared thread state instead of returning to a coordinator
Journey Context:
The 'supervisor with workers' pattern collapses under load because every step is serialized through the supervisor's LLM call. Production systems are moving to graph-based execution, where agents are nodes and edges are routing functions. The key insight: agents write directly to a shared state store (the thread), and the graph topology defines the handoffs. This removes the 'ask supervisor' round-trip. The alternative considered was an async supervisor, but that only hides the latency, whereas the graph topology removes the bottleneck itself.
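A minimal, library-free sketch of the pattern described above, for illustration only. It does not use LangGraph itself: the `(node, payload)` tuples returned by the routing functions stand in for the role `Send` plays (a target node plus a per-branch payload), and the agent names (`plan`, `research`, `synthesize`), the `merge` reducer, and the state keys are all hypothetical. Agents return partial updates that are merged into shared state, and plain routing functions decide the next nodes, so no supervisor call sits between steps.

```python
import asyncio
from typing import Any

State = dict[str, Any]

# Hypothetical agents: each reads shared state plus a payload and
# returns a partial update instead of reporting back to a coordinator.
async def plan(state: State, payload: Any) -> State:
    return {"subtopics": [f"{state['question']} / part {i}" for i in range(3)]}

async def research(state: State, payload: Any) -> State:
    await asyncio.sleep(0.01)  # stands in for an LLM or tool call
    return {"findings": [f"notes on {payload}"]}

async def synthesize(state: State, payload: Any) -> State:
    return {"report": "; ".join(sorted(state["findings"]))}

NODES = {"plan": plan, "research": research, "synthesize": synthesize}

# Edges are routing functions: plain code, no LLM round-trip.
# Each returns a list of (node, payload) pairs; one pair per branch.
ROUTES = {
    "plan": lambda s: [("research", topic) for topic in s["subtopics"]],
    "research": lambda s: [("synthesize", None)],
    "synthesize": lambda s: [],
}

def merge(state: State, update: State) -> None:
    """Assumed reducer: lists are appended, other values are overwritten."""
    for key, value in update.items():
        if isinstance(value, list):
            state[key] = state.get(key, []) + value
        else:
            state[key] = value

async def run(question: str) -> State:
    state: State = {"question": question}
    frontier = [("plan", None)]
    while frontier:
        # Fan-out: every branch in the current step runs concurrently.
        updates = await asyncio.gather(
            *(NODES[name](state, payload) for name, payload in frontier)
        )
        for update in updates:
            merge(state, update)
        # Ask each distinct node's routing function for the next steps,
        # dropping duplicates so parallel branches join at a single node.
        next_steps = []
        for name in dict.fromkeys(name for name, _ in frontier):
            for step in ROUTES[name](state):
                if step not in next_steps:
                    next_steps.append(step)
        frontier = next_steps
    return state

if __name__ == "__main__":
    print(asyncio.run(run("battery recycling"))["report"])
```

The three `research` branches run in a single concurrent step and write into the same `findings` list, and the handoff to `synthesize` comes from the routing table rather than from a coordinator deciding what runs next.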
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:52:44.108461+00:00— report_created — created