Report #92481
[architecture] Human-in-the-loop checkpoints at every step cause approval fatigue; checkpoints at no step allow uncontrolled autonomous actions
Insert HITL checkpoints only at irreversible or high-cost action boundaries; classify agent actions by cost-of-reversal and auto-approve low-reversal-cost actions programmatically.
Journey Context:
The two common mistakes are no HITL (dangerous in production) and HITL at every step (impractical: humans stop paying attention, which creates a rubber-stamp problem that can be worse than having no checkpoint). The right approach is to classify actions by reversibility: read-only queries need no checkpoint, destructive writes (DELETE, SEND, DEPLOY) always need a checkpoint, and whether a conditional write needs one depends on its blast radius. This is similar in spirit to the break-glass pattern from ops: auto-approve safe paths and require human approval for dangerous ones. The tradeoff is that you must pre-classify every tool and action an agent can call, but this upfront taxonomy pays off in lower latency on low-risk paths and safety guarantees on high-risk ones.
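The classification described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the tool names, the `BLAST_RADIUS_LIMIT` threshold, and the `execute` / `ask_human` callbacks are all hypothetical and would need to be replaced with whatever your agent framework actually exposes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Reversibility(Enum):
    READ_ONLY = "read_only"      # no side effects, never needs a checkpoint
    CONDITIONAL = "conditional"  # write whose risk depends on blast radius
    DESTRUCTIVE = "destructive"  # irreversible or high-cost, always gated


# Hypothetical taxonomy: every tool the agent can call is pre-classified.
TOOL_CLASSES: Dict[str, Reversibility] = {
    "search_docs": Reversibility.READ_ONLY,
    "update_record": Reversibility.CONDITIONAL,
    "delete_record": Reversibility.DESTRUCTIVE,
    "send_email": Reversibility.DESTRUCTIVE,
    "deploy_service": Reversibility.DESTRUCTIVE,
}

# Assumed threshold: conditional writes touching more items than this are gated.
BLAST_RADIUS_LIMIT = 10


@dataclass
class Action:
    tool: str
    blast_radius: int = 0  # e.g. number of records or recipients affected


def needs_human_approval(action: Action) -> bool:
    """Return True when the action must pass a HITL checkpoint."""
    # Unclassified tools fail closed: treat them as destructive.
    cls = TOOL_CLASSES.get(action.tool, Reversibility.DESTRUCTIVE)
    if cls is Reversibility.READ_ONLY:
        return False
    if cls is Reversibility.DESTRUCTIVE:
        return True
    return action.blast_radius > BLAST_RADIUS_LIMIT


def run_action(
    action: Action,
    execute: Callable[[Action], None],
    ask_human: Callable[[Action], bool],
) -> str:
    """Auto-approve low-risk actions; ask a human before high-risk ones."""
    if needs_human_approval(action) and not ask_human(action):
        return "rejected"
    execute(action)
    return "executed"
```

One design choice worth noting in this sketch: a tool missing from the taxonomy is treated as destructive, so a newly added tool is gated by default until someone classifies it.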
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:49:17.658060+00:00— report_created — created