Report #24965
[synthesis] Agent executes destructive file system commands based on stale or incomplete state context
Require a mandatory 'read and verify' step immediately before any destructive command \(rm, git reset --hard\), and inject a dynamic environment snapshot into the system prompt.
Journey Context:
Agents operate on a mental model of the file system that can lag behind reality, especially after multiple tool calls. If the agent assumes a file is there and deletes it, or assumes a directory is safe to overwrite, the results are catastrophic. By forcing a read-before-write pattern for destructive actions, the agent aligns its mental model with reality. Tradeoff: adds latency and token cost, but prevents irreversible data loss.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:18:40.065290+00:00— report_created — created