Report #99464
[architecture] Where and how to insert human-in-the-loop checkpoints in multi-agent chains
Place checkpoints at irreversible actions (payments, deploys, external commits) and at handoff boundaries where one agent's output is about to become another agent's instruction context. Use an explicit interrupt-and-wait primitive: serialize state when pausing, and resume only after a human signal.
Journey Context:
Many tutorials add a UserProxyAgent with human_input_mode='ALWAYS' and call it done. That either kills throughput or trains reviewers to click approve without reading. The right model is conditional: low-risk structured outputs run autonomously; high-risk or ambiguous states pause. AutoGen's HITL tutorial models this as a durable suspension, not a side effect. The checkpoint must persist its state so that a web app can render the pending decision and resume the chain later, possibly in a different process. Without durable interrupts, a restart either duplicates work or silently skips human review.
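The conditional pause and durable resume described above can be sketched in plain Python. This is a minimal illustration, not AutoGen's API: `needs_human_review`, `CheckpointStore`, `run_step`, `resume`, the action names, and the `handoff`/`ambiguous` step fields are all hypothetical names chosen for this example, and a directory of JSON files stands in for whatever durable store a real deployment would use.

```python
import json
import uuid
from pathlib import Path

# Hypothetical action names for illustration; not taken from any library.
IRREVERSIBLE_ACTIONS = {"payment", "deploy", "external_commit"}


def needs_human_review(step: dict) -> bool:
    """Pause on irreversible actions, or on handoffs flagged as ambiguous."""
    if step["action"] in IRREVERSIBLE_ACTIONS:
        return True
    return bool(step.get("handoff")) and bool(step.get("ambiguous"))


class CheckpointStore:
    """Persists checkpoints as JSON files so they survive a process restart."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, checkpoint_id: str) -> Path:
        return self.root / f"{checkpoint_id}.json"

    def suspend(self, step: dict, state: dict) -> str:
        """Serialize the pending step and its state; return an id to resume with."""
        checkpoint_id = uuid.uuid4().hex
        record = {"step": step, "state": state, "status": "pending"}
        self._path(checkpoint_id).write_text(json.dumps(record))
        return checkpoint_id

    def load(self, checkpoint_id: str) -> dict:
        return json.loads(self._path(checkpoint_id).read_text())

    def resolve(self, checkpoint_id: str, approved: bool) -> dict:
        """Record the human decision once; later calls leave it unchanged."""
        record = self.load(checkpoint_id)
        if record["status"] == "pending":
            record["status"] = "approved" if approved else "rejected"
            self._path(checkpoint_id).write_text(json.dumps(record))
        return record


def run_step(step: dict, state: dict, store: CheckpointStore, execute) -> dict:
    """Run low-risk steps directly; suspend anything that needs review."""
    if needs_human_review(step):
        return {"suspended": store.suspend(step, state)}
    return {"result": execute(step, state)}


def resume(checkpoint_id: str, store: CheckpointStore, execute, approved: bool) -> dict:
    """Apply a human decision to a suspended step, at most once per checkpoint."""
    if store.load(checkpoint_id)["status"] != "pending":
        # Already decided: do not re-run the side effect.
        return {"skipped": store.load(checkpoint_id)["status"]}
    record = store.resolve(checkpoint_id, approved)
    if record["status"] == "approved":
        return {"result": execute(record["step"], record["state"])}
    return {"rejected": checkpoint_id}
```

A web handler would call `resume` with the id returned by `run_step` once the reviewer decides. The sketch does not handle a crash between recording the decision and executing the step, nor concurrent reviewers; a production version would likely need an idempotency key on the side effect and an atomic status update.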
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:11:08.538254+00:00— report_created — created