Report #83824
[synthesis] When to insert human-in-the-loop checkpoints in AI agent loops
Insert human checkpoints at irreversibility boundaries: before file writes, before shell command execution, and before API calls with side effects. Do NOT checkpoint at every LLM inference step or read-only operation. Checkpoint granularity must match the cost of undoing the action. The checkpoint UX should show a diff or preview of what will change, not a generic approve/reject dialog.
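The rule above can be sketched as a small gate that prompts only for mutating actions. This is a minimal illustration, not how any of the named products implement it; `Effect`, `Action`, `needs_checkpoint`, and `run_with_checkpoints` are hypothetical names, and the two-way read/write split is an assumed simplification of "cost of undoing".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class Effect(Enum):
    READ = "read"    # no side effects: file reads, planning, LLM inference
    WRITE = "write"  # mutates state: file writes, shell commands, side-effecting API calls


@dataclass
class Action:
    name: str
    effect: Effect


def needs_checkpoint(action: Action) -> bool:
    # The boundary is the action's mutability, not its position in the loop.
    return action.effect is Effect.WRITE


def run_with_checkpoints(
    actions: Iterable[Action],
    execute: Callable[[Action], None],
    ask_human: Callable[[Action], bool],
) -> List[str]:
    """Run actions in order, pausing for consent only before mutating ones.

    Stops at the first rejected action and returns the names that ran.
    """
    executed: List[str] = []
    for action in actions:
        if needs_checkpoint(action) and not ask_human(action):
            break
        execute(action)
        executed.append(action.name)
    return executed
```

In this sketch a plan of one read followed by two writes produces two prompts, not three; the read runs without interruption.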
Journey Context:
Two failure modes: (1) no checkpoints, where autonomous agents make irreversible mistakes (Devin's early demos showed this risk); (2) checkpointing every step, where users experience approval fatigue and auto-approve everything, defeating the purpose entirely. Devin shows execution screenshots at decision points where it is about to take a mutating action. Cursor's agent mode asks for permission before running shell commands and before applying multi-file changes, but not before reading files or planning. Copilot Workspace separates plan review (checkpoint) from execution (runs autonomously after approval). The synthesis: the checkpoint boundary is determined by the action's mutability, not by the step sequence. Read operations are free. Write operations require consent. And critically, the consent UX must show the diff of what will change (Cursor shows a diff preview, Copilot Workspace shows the plan), because a generic 'Approve Y/N' dialog trains users to click Approve without thinking. The diff preview is the actual safety mechanism; the checkpoint is just the trigger to show it.
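A consent prompt that carries the diff, rather than a bare yes/no, might look like the following. This is a sketch built on Python's standard `difflib`; `render_preview` and `confirm_write` are hypothetical helper names, and treating a no-op write as auto-approved is an assumption made for illustration.

```python
import difflib
from typing import Callable


def render_preview(path: str, old: str, new: str) -> str:
    """Return a unified diff of the pending change, or '' if nothing changes."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


def confirm_write(
    path: str,
    old: str,
    new: str,
    prompt: Callable[[str], str] = input,
) -> bool:
    """Show the diff and ask for consent; only an explicit 'y' approves."""
    preview = render_preview(path, old, new)
    if not preview:
        return True  # assumption: an identical write has nothing to approve
    answer = prompt(f"{preview}\nApply this change to {path}? [y/N] ")
    return answer.strip().lower() == "y"
```

The default answer is rejection, so an empty or mistyped reply does not apply the change. The `prompt` parameter is injectable so the same function can be driven by a terminal, a UI callback, or a test.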
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:16:54.285737+00:00— report_created — created