Report #79124
[frontier] Adding human approval to agent workflows requires fragile polling, webhooks, or blocking the entire execution thread
Use interrupt/resume patterns: the agent graph reaches an approval node, persists its full state via checkpointing, and halts cleanly. When the human approves (minutes or days later), execution resumes from the exact checkpoint with the human's decision injected into state. No polling, no blocked threads.
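The mechanism can be sketched with the standard library alone. This is an illustrative toy, not LangGraph's API: the `Interrupt` exception, the `CHECKPOINTS` dict, the `run` function, and the three node functions are all hypothetical names chosen for this example. A node raises `Interrupt` to pause, the runner snapshots the state together with the name of the paused node, and a later call with `resume=` re-enters that node with the human's decision.

```python
import json


class Interrupt(Exception):
    """Raised by a node to pause the workflow until a human responds."""

    def __init__(self, payload):
        super().__init__("waiting for human input")
        self.payload = payload


# Hypothetical in-memory checkpoint store: thread_id -> JSON snapshot.
# A real deployment would use durable storage instead.
CHECKPOINTS = {}


def propose(state, resume):
    state["proposal"] = f"delete {state['target']}"
    return "approve"


def approve(state, resume):
    if resume is None:
        # No decision yet: halt here and surface the proposal to a human.
        raise Interrupt({"question": "Approve?", "proposal": state["proposal"]})
    state["approved"] = bool(resume)
    return "execute"


def execute(state, resume):
    state["result"] = "done" if state["approved"] else "skipped"
    return None  # no next node: the workflow is finished


NODES = {"propose": propose, "approve": approve, "execute": execute}


def run(thread_id, state=None, resume=None):
    """Start a workflow (state given) or resume one from its checkpoint."""
    if state is None:
        snapshot = json.loads(CHECKPOINTS[thread_id])
        state, node = snapshot["state"], snapshot["next"]
    else:
        node = "propose"
    while node is not None:
        try:
            next_node = NODES[node](state, resume)
        except Interrupt as stop:
            # Persist the state and the paused node, then return immediately.
            CHECKPOINTS[thread_id] = json.dumps({"state": state, "next": node})
            return {"status": "interrupted", "payload": stop.payload}
        resume = None  # the decision is consumed by the node that asked for it
        node = next_node
    CHECKPOINTS.pop(thread_id, None)
    return {"status": "finished", "state": state}
```

Calling `run("t1", state={"target": "old-logs"})` returns right away with status `"interrupted"`, so no thread waits on the human. A later `run("t1", resume=True)` loads the snapshot and continues from the `approve` node. In this sketch the paused node is re-executed from its first line on resume, so anything it does before raising `Interrupt` should be safe to repeat.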
Journey Context:
The naive approach to human-in-the-loop is to block the execution thread while waiting for approval, which wastes resources on server-based agents and does not scale. The alternatives, polling or webhooks, add complexity and failure modes. The interrupt/resume pattern, implemented in LangGraph, avoids both: graph execution hits a node that calls interrupt(), the full state is checkpointed, and execution stops. The human reviews the proposed action via any UI, and when they respond, the graph is re-invoked with the human's input and resumes from the checkpoint. This works even if the human takes days to respond, and it survives server restarts provided the checkpoints are written to durable storage. The tradeoff: you need a persistence layer and a way to surface pending approvals to humans (a notification system, a dashboard). The pattern is gaining ground because it cleanly decouples agent execution from human response time, which is essential for production deployments where agents propose actions (file writes, API calls, payments, deployments) that require approval. It also enables approval workflows where different humans handle different types of decisions.
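The persistence layer and the pending-approval view can be quite small. The following is a minimal sketch using Python's built-in `sqlite3`, not LangGraph's own checkpointer; the table layout, the `APPROVER_FOR` routing table, and the functions `open_store`, `park`, `pending_for`, and `decide` are all assumptions made for illustration. It shows a parked workflow being routed to an approver role by action type, listed as pending, and released with the decision injected into its state.

```python
import json
import sqlite3

# Hypothetical routing: which role approves which kind of proposed action.
APPROVER_FOR = {"payment": "finance", "deployment": "sre", "file_write": "owner"}


def open_store(path=":memory:"):
    """Open the checkpoint store. Pass a file path to survive restarts."""
    db = sqlite3.connect(path)
    # decision stays NULL while the approval is pending.
    db.execute(
        """CREATE TABLE IF NOT EXISTS checkpoints (
               thread_id TEXT PRIMARY KEY,
               state     TEXT NOT NULL,
               action    TEXT NOT NULL,
               approver  TEXT NOT NULL,
               decision  TEXT
           )"""
    )
    return db


def park(db, thread_id, state, action):
    """Checkpoint a paused workflow and assign it to the right approver role."""
    db.execute(
        "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?, ?, NULL)",
        (thread_id, json.dumps(state), action, APPROVER_FOR[action]),
    )
    db.commit()


def pending_for(db, approver):
    """List undecided approvals for one role, e.g. to render a dashboard."""
    rows = db.execute(
        "SELECT thread_id, action, state FROM checkpoints "
        "WHERE approver = ? AND decision IS NULL ORDER BY thread_id",
        (approver,),
    ).fetchall()
    return [(tid, action, json.loads(state)) for tid, action, state in rows]


def decide(db, thread_id, decision):
    """Record the human's decision and return the state to resume with."""
    row = db.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ? AND decision IS NULL",
        (thread_id,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no pending approval for {thread_id!r}")
    db.execute(
        "UPDATE checkpoints SET decision = ? WHERE thread_id = ?",
        (decision, thread_id),
    )
    db.commit()
    state = json.loads(row[0])
    state["decision"] = decision  # inject the decision into the resumed state
    return state
```

With a file path instead of `":memory:"`, a parked workflow outlives the process that created it, and any process that can open the same file can list or resolve it. A notification system would call `pending_for` (or react to `park`) to tell the right people that something is waiting.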
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:24:15.438458+00:00— report_created — created