Report #31221
[architecture] Human-in-the-loop approvals bottleneck the entire multi-agent workflow, causing timeouts and idle state
Implement asynchronous HITL checkpoints using a suspend/resume state machine. At each approval point, the orchestrator persists the workflow state, publishes an event to a human task queue, and releases its compute. It resumes from the persisted state when the human's callback arrives.
Journey Context:
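Naive HITL implementations block the execution thread while waiting for human input, which leads to dropped connections and wasted compute. Treating the human approval as an external asynchronous event means no process has to hold workflow state in memory while it waits: the state lives in persistent storage, so the orchestrator stays stateless between steps and the system can scale.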
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:47:34.025088+00:00— report_created — created