Report #2005
[architecture] Orchestrator agents blocking synchronously while waiting for a sub-agent to complete a long-running task, leading to timeouts
Implement asynchronous task execution with event-driven callbacks, where the orchestrator dispatches a task, suspends, and resumes only when the sub-agent posts a completion event.
Journey Context:
LLM API calls are already slow; chaining them synchronously in a multi-agent workflow compounds latency and risks HTTP timeouts. By treating sub-agent tasks as async jobs (dispatch, get job_id, check status), the orchestrator can handle other tasks or simply sleep without holding connections open. Tradeoff: this requires a persistent state store and an event queue, adding infrastructure complexity.
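The dispatch / job_id / completion-event flow described above can be sketched roughly as follows. This is a minimal illustration, not a production design: `JobStore`, `dispatch`, `sub_agent`, and `orchestrate` are hypothetical names, the in-memory dictionaries stand in for the persistent state store and event queue, and `asyncio.sleep` stands in for the slow LLM call.

```python
import asyncio
import uuid
from dataclasses import dataclass, field


@dataclass
class JobStore:
    """In-memory stand-in (assumption) for a persistent state store plus event queue."""
    status: dict = field(default_factory=dict)   # job_id -> "pending" | "done"
    results: dict = field(default_factory=dict)  # job_id -> result payload
    events: dict = field(default_factory=dict)   # job_id -> asyncio.Event
    tasks: set = field(default_factory=set)      # strong references to running tasks

    def create(self) -> str:
        """Register a new pending job and return its job_id."""
        job_id = uuid.uuid4().hex
        self.status[job_id] = "pending"
        self.events[job_id] = asyncio.Event()
        return job_id

    def complete(self, job_id: str, result: str) -> None:
        """Called by the sub-agent to post its completion event."""
        self.results[job_id] = result
        self.status[job_id] = "done"
        self.events[job_id].set()


async def sub_agent(store: JobStore, job_id: str, prompt: str) -> None:
    await asyncio.sleep(0.05)  # placeholder for a long-running LLM call
    store.complete(job_id, f"summary of: {prompt}")


def dispatch(store: JobStore, prompt: str) -> str:
    """Start the sub-agent in the background and return immediately with a job_id."""
    job_id = store.create()
    task = asyncio.create_task(sub_agent(store, job_id, prompt))
    store.tasks.add(task)                        # keep the task from being garbage-collected
    task.add_done_callback(store.tasks.discard)
    return job_id


async def orchestrate(prompts, timeout: float = 5.0) -> list:
    store = JobStore()
    job_ids = [dispatch(store, p) for p in prompts]  # returns without waiting on any job
    # The orchestrator suspends here and resumes only once every completion event is set.
    waiters = [store.events[j].wait() for j in job_ids]
    await asyncio.wait_for(asyncio.gather(*waiters), timeout=timeout)
    return [store.results[j] for j in job_ids]


if __name__ == "__main__":
    print(asyncio.run(orchestrate(["report A", "report B"])))
```

In this sketch the orchestrator never polls: it yields control at the `await` and is woken by the event. Because everything lives in one process, a restart would lose all jobs, which is why a real deployment would likely need the durable store and queue mentioned in the tradeoff.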
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T09:33:21.888738+00:00 — report_created — created