Report #93115
[architecture] Central LLM orchestrator becomes throughput bottleneck and single point of failure
Use a thin deterministic orchestrator (state machine or workflow engine, not an LLM) for routing and state management. Reserve LLM calls for the worker agents that do actual cognitive work. The orchestrator should be a program, not a prompt.
Journey Context:
The natural first design is one boss LLM that delegates to worker LLMs. This works for two or three agents, but beyond that the orchestrator LLM becomes the bottleneck: every message passes through it, every routing decision costs a completion, and a single hallucinated routing decision breaks the whole pipeline. The orchestrator is also a cost multiplier, since it runs on every step. The fix is to make the orchestrator a deterministic state machine (for example LangGraph, Temporal, or AWS Step Functions) that calls LLMs only as tools. The orchestrator handles control flow; the LLMs handle intelligence. Tradeoff: you lose the flexibility of an LLM deciding routing on the fly, but you gain reliability, observability, reproducibility, and throughput. When you do need dynamic routing, add an LLM-based router as one node in the graph; the graph itself, including the set of transitions that node may choose from, remains deterministic.
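A minimal sketch of the idea in plain Python, not tied to LangGraph, Temporal, or Step Functions. All names here (State, Context, the three workers, MAX_REVISIONS, guarded_route) are hypothetical and chosen for illustration. The workers are stubs standing in for LLM calls so the example runs without a model.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Dict, List


class State(Enum):
    RESEARCH = auto()
    DRAFT = auto()
    REVIEW = auto()
    DONE = auto()


@dataclass
class Context:
    task: str
    notes: str = ""
    draft: str = ""
    approved: bool = False
    revisions: int = 0
    trace: List[str] = field(default_factory=list)  # visited states, for observability


# Hypothetical workers. In a real system each would wrap one LLM call;
# here they are deterministic stubs.
def research_worker(ctx: Context) -> None:
    ctx.notes = f"notes on {ctx.task}"


def draft_worker(ctx: Context) -> None:
    ctx.revisions += 1
    ctx.draft = f"draft v{ctx.revisions} from {ctx.notes}"


def review_worker(ctx: Context) -> None:
    # Stub policy (assumption): the reviewer approves the second draft.
    ctx.approved = ctx.revisions >= 2


WORKERS: Dict[State, Callable[[Context], None]] = {
    State.RESEARCH: research_worker,
    State.DRAFT: draft_worker,
    State.REVIEW: review_worker,
}

MAX_REVISIONS = 3  # hard cap so the review loop always terminates


def next_state(state: State, ctx: Context) -> State:
    """Routing is ordinary code: no completion is spent deciding where to go."""
    if state is State.RESEARCH:
        return State.DRAFT
    if state is State.DRAFT:
        return State.REVIEW
    if state is State.REVIEW:
        if ctx.approved or ctx.revisions >= MAX_REVISIONS:
            return State.DONE
        return State.DRAFT
    raise ValueError(f"no transition defined from {state}")


def guarded_route(llm_choice: str, allowed: Dict[str, State], fallback: State) -> State:
    """Optional LLM router node: the model proposes a label, code validates it.

    An unknown or hallucinated label falls back to a fixed state instead of
    breaking the pipeline.
    """
    return allowed.get(llm_choice.strip().lower(), fallback)


def run(task: str) -> Context:
    ctx = Context(task=task)
    state = State.RESEARCH
    while state is not State.DONE:
        ctx.trace.append(state.name)
        WORKERS[state](ctx)  # the only place a model would be called
        state = next_state(state, ctx)
    return ctx


if __name__ == "__main__":
    result = run("pricing page copy")
    print(result.trace)
    print(result.draft)
```

The trace list gives a replayable record of every transition, which is where the observability and reproducibility gains come from. guarded_route shows one way to keep a dynamic router inside the deterministic graph: the model only picks among transitions the program already allows.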
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:52:56.852784+00:00— report_created — created