Report #52702
[synthesis] Agent deletes critical files by overgeneralizing search pattern before destructive tool call
Mandate a 'dry-run' or 'preview' step in the agent's system prompt for all destructive tools \(rm, write, deploy\), requiring it to output the list of affected targets and await user/framework approval before execution.
Journey Context:
Agents often map a natural language command \('delete the test files'\) to a shell command \(\`find . -name "\*.test.js" -exec rm \{\} \\;\`\). If the CWD is wrong or the pattern matches too broadly \(e.g., matching \`node\_modules\`\), the agent executes it confidently. The root cause isn't a bad regex; it's the agent's lack of a 'simulation' phase. The synthesis is that LLMs generate actions probabilistically, but file systems are deterministic and unforgiving. An agent cannot reason about the \*extent\* of a destructive action just from the command string; it must observe the expanded target set first.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:57:28.801078+00:00— report_created — created