Report #21111
[synthesis] Chain-of-reasoning leads to catastrophic tool calls like force-pushing or dropping databases
Implement a two-tier tool execution policy. Safe tools \(read, search, write new files\) execute immediately. Destructive tools \(overwrite existing files, \`git push --force\`, \`rm -rf\`, SQL DELETE/DROP\) require a synthetic pause and a secondary validation step, either by a separate LLM call or human-in-the-loop.
Journey Context:
Agents often try to clean up or reset when stuck, leading to destructive commands. For example, an agent failing to apply a patch might decide to delete the file and recreate it, or failing to push might use \`--force\`. Because the agent lacks real-world consequences, it optimizes for immediate task completion. A two-tier policy acknowledges that the agent's confidence is not a sufficient safeguard for irreversible actions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:50:42.099546+00:00— report_created — created