Report #99934
[architecture] High-risk or irreversible actions run autonomously without a deterministic human checkpoint
Define a risk matrix in policy (money, privacy, irreversibility, scope of effect) and gate every agent action through it. For 'high'/'critical' impact classes, pause execution and surface a structured approval request to a human with full context (intent, evidence, rollback plan). Default to deny; treat missing classification as high risk.
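A minimal sketch of the gate described above, in Python. Everything named here (Impact, RISK_MATRIX, ApprovalRequest, gate, and the example action names) is hypothetical and chosen for illustration; a real policy would be loaded from configuration and would score the four dimensions rather than use a flat lookup.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Impact(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical policy table mapping action names to impact classes.
RISK_MATRIX = {
    "read_public_docs": Impact.LOW,
    "send_internal_email": Impact.MEDIUM,
    "issue_refund": Impact.HIGH,               # money
    "delete_customer_records": Impact.CRITICAL,  # privacy, irreversible
}


@dataclass(frozen=True)
class ApprovalRequest:
    """Structured context shown to the human approver."""
    action: str
    impact: Impact
    intent: str
    evidence: str
    rollback_plan: str


@dataclass(frozen=True)
class GateResult:
    allowed: bool
    reason: str
    approval_request: Optional[ApprovalRequest] = None


def classify(action: str) -> Impact:
    # A missing classification is treated as high risk.
    return RISK_MATRIX.get(action, Impact.HIGH)


def gate(action: str, intent: str, evidence: str, rollback_plan: str,
         approved_by: Optional[str] = None) -> GateResult:
    impact = classify(action)
    if impact < Impact.HIGH:
        return GateResult(True, f"{impact.name} impact: no approval required")
    if approved_by:
        return GateResult(True, f"approved by {approved_by}")
    # Default to deny: pause and surface a structured request instead.
    request = ApprovalRequest(action, impact, intent, evidence, rollback_plan)
    return GateResult(False, "awaiting human approval", request)
```

With this sketch, gate("issue_refund", ...) returns a denied result carrying the approval request until approved_by is supplied, and an action absent from the table is handled exactly like a high-impact one.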
Journey Context:
Human-in-the-loop is often added as an afterthought, typically a UI button, which makes it inconsistent and bypassable. NIST AI RMF's Manage function treats human oversight as a risk-treatment control, not a UX nicety. The right pattern is policy-driven gating: the agent cannot proceed until a recorded human decision is received. Tradeoffs: latency and operator burden. Mitigate both by requiring approval only for high-impact actions and by pre-staging rollback plans.
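One way to make "cannot proceed until a recorded human decision is received" concrete is to have the executor look up the decision itself, so no caller can skip the check. The sketch below is illustrative only: DecisionLog, run_gated, and the exception names are hypothetical, and the in-memory dictionary stands in for whatever durable, append-only store a real deployment would use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Decision:
    request_id: str
    approver: str
    approved: bool
    decided_at: datetime


class ApprovalPending(Exception):
    """No human decision has been recorded yet."""


class ApprovalDenied(Exception):
    """A human reviewed the request and rejected it."""


class DecisionLog:
    """In-memory stand-in for a durable decision store (assumption)."""

    def __init__(self) -> None:
        self._decisions: Dict[str, Decision] = {}

    def record(self, request_id: str, approver: str, approved: bool) -> Decision:
        # Decisions are write-once so an approval cannot be silently replaced.
        if request_id in self._decisions:
            raise ValueError(f"decision already recorded for {request_id}")
        decision = Decision(request_id, approver, approved,
                            datetime.now(timezone.utc))
        self._decisions[request_id] = decision
        return decision

    def get(self, request_id: str) -> Optional[Decision]:
        return self._decisions.get(request_id)


def run_gated(log: DecisionLog, request_id: str,
              action: Callable[[], str]) -> str:
    """Run the action only if an approving decision is on record."""
    decision = log.get(request_id)
    if decision is None:
        raise ApprovalPending(request_id)
    if not decision.approved:
        raise ApprovalDenied(request_id)
    return action()
```

Because the check lives inside run_gated rather than in the UI, a missing decision fails closed: the action callable is never invoked.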
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:18:21.617596+00:00— report_created — created