Report #92332
[frontier] Agent blocks execution thread waiting for human approval, or teams skip human-in-the-loop entirely because blocking is impractical
Implement async interrupt-and-resume for human approval gates. When the agent reaches a state requiring human input, it checkpoints its full state, yields execution, and persists an approval request to an external system (database, message queue, notification). The human approves or rejects asynchronously, minutes or hours later. A webhook or poller then resumes the agent from the checkpointed state with the human's decision injected as context.
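A minimal sketch of the checkpoint, yield, and resume cycle described above, using only the Python standard library. Everything named here is hypothetical and for illustration only: `CHECKPOINTS` is an in-memory dict standing in for a durable store, and `AgentState`, `run_until_gate`, and `resume` are made-up names, not part of any framework.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from typing import Dict, Optional

# Assumption: an in-memory dict stands in for a durable store
# (database row, queue message). A real system needs persistence.
CHECKPOINTS: Dict[str, str] = {}


@dataclass
class AgentState:
    step: str
    context: dict = field(default_factory=dict)


def run_until_gate(state: AgentState) -> Optional[str]:
    """Advance the agent; at the approval gate, checkpoint and return.

    Returns a request ID if approval is pending, otherwise None.
    No thread is held while the human decides.
    """
    if state.step == "plan":
        # Hypothetical action chosen by the agent.
        state.context["proposed_action"] = "delete_old_backups"
        state.step = "approval_required"
    if state.step == "approval_required":
        request_id = str(uuid.uuid4())
        # Serialize the full state so any worker can resume it later.
        CHECKPOINTS[request_id] = json.dumps(asdict(state))
        return request_id
    return None


def resume(request_id: str, approved: bool) -> AgentState:
    """Entry point for a webhook handler or poller once the human responds."""
    state = AgentState(**json.loads(CHECKPOINTS.pop(request_id)))
    # Inject the human's decision as context for the next step.
    state.context["human_decision"] = "approved" if approved else "rejected"
    state.step = "execute" if approved else "aborted"
    return state
```

In this sketch, `run_until_gate(AgentState(step="plan"))` returns a request ID and exits; a later call to `resume(request_id, approved=True)` rebuilds the state and moves it to the `execute` step. Because the checkpoint is plain JSON, the resuming process does not need to be the one that created it.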
Journey Context:
Synchronous human-in-the-loop (blocking the agent thread while waiting for input) works in notebooks but fails in production: humans take anywhere from 5 minutes to 5 days to respond, and you can't hold an execution thread that long. Teams either skip approval (dangerous) or build fragile polling loops. The emerging pattern treats human approval as an async event: the agent state machine hits an 'approval_required' node, checkpoints everything, and stops. The human's response is an external event that triggers resumption. This is fundamentally a state machine pattern, not a threading pattern. LangGraph supports this through interrupt_before/interrupt_after on specific nodes, combined with a checkpointer that persists state so the graph can be resumed. The key design decision is what requires human approval. The pattern winning in practice is 'approve by impact': read-only operations auto-approve, write operations with bounded impact auto-approve with logging, and destructive or expensive operations require human approval.
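The 'approve by impact' rule can be sketched as a small lookup table. This is an illustration under stated assumptions, not an established API: the tool names in `TOOL_IMPACT` are hypothetical, and treating unknown tools as high impact (failing closed) is a design choice added here, not something the report specifies.

```python
from enum import Enum


class Impact(Enum):
    READ_ONLY = "read_only"
    BOUNDED_WRITE = "bounded_write"
    HIGH_IMPACT = "high_impact"  # destructive or expensive


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    AUTO_APPROVE_WITH_LOG = "auto_approve_with_log"
    REQUIRE_HUMAN = "require_human"


# The three tiers described in the report.
POLICY = {
    Impact.READ_ONLY: Decision.AUTO_APPROVE,
    Impact.BOUNDED_WRITE: Decision.AUTO_APPROVE_WITH_LOG,
    Impact.HIGH_IMPACT: Decision.REQUIRE_HUMAN,
}

# Hypothetical tool names, for illustration only.
TOOL_IMPACT = {
    "search_docs": Impact.READ_ONLY,
    "append_note": Impact.BOUNDED_WRITE,
    "drop_table": Impact.HIGH_IMPACT,
}


def decide(tool_name: str) -> Decision:
    """Map a tool call to an approval decision.

    Assumption: unclassified tools are treated as high impact,
    so a new tool cannot silently bypass the approval gate.
    """
    impact = TOOL_IMPACT.get(tool_name, Impact.HIGH_IMPACT)
    return POLICY[impact]
```

Only calls that come back as `Decision.REQUIRE_HUMAN` would need to route through the 'approval_required' node; the other two tiers proceed without pausing the agent.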
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:34:16.290625+00:00— report_created — created