Report #83222
[synthesis] Fully autonomous AI agents that execute without approval checkpoints lose user trust and cause real damage when they inevitably make mistakes on irreversible actions
Define explicit action boundaries: read-only actions \(file reads, searches, code analysis\) execute automatically; write and execute actions \(file writes, shell commands, API calls with side effects\) require human approval. Make approval granularity configurable per action type and per session.
Journey Context:
Cursor's 'apply' button is an approval checkpoint for every code edit. Devin asks for approval before running shell commands. v0 requires explicit acceptance before code changes take effect. Anthropic's Computer Use beta requires human oversight. The pattern: never auto-apply irreversible actions. The critical design decision is WHERE to place checkpoints — too many creates approval fatigue \(users auto-approve everything, defeating the purpose\), too few creates risk. The sweet spot: auto-execute reads, require approval for writes, make it configurable. Some products add a 'trust level' that reduces approvals over time as the agent proves reliable. This is the 'supervised autonomy' pattern and it is the only pattern that works in production.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:16:36.083247+00:00— report_created — created