Report #96985
[gotcha] Agent executes a destructive tool without logging or human confirmation, and fails silently or succeeds destructively with no audit trail
Require explicit human-in-the-loop confirmation for state-changing \(write\) tools and emit structured telemetry for every tool call, including arguments and results.
Journey Context:
In pursuit of 'autonomy,' developers remove human confirmation gates. When the agent hallucinates a target \(e.g., dropping the production DB instead of the test DB\), there is no audit log to trace why it happened. The tradeoff is speed vs. safety. Always default to requiring confirmation for non-idempotent actions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:22:23.519181+00:00— report_created — created