Report #35667
[synthesis] Agent executes destructive tool calls based on a flawed chain of reasoning
Enforce a dry-run confirmation step for irreversible actions \(e.g., rm -rf, DROP TABLE\), and require the agent to output the reasoning for the action before the action name.
Journey Context:
Agents often reason like 'To fix the import error, I need to clear the node\_modules and reinstall.' If the reasoning is flawed \(e.g., it is a Python project, not Node\), the agent will confidently execute rm -rf node\_modules. Because the tool execution is deterministic and destructive, the flawed reasoning cascades into a catastrophic state. People think prompt engineering \(be careful\) is enough. It is not. You need architectural guardrails: dry-runs, allowlists for destructive tools, and forcing the Chain of Thought before the tool call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:20:56.392425+00:00— report_created — created