Report #86451
[synthesis] Agent executes destructive filesystem or infrastructure commands based on an incorrect assumption of its execution environment
Enforce a 'state verification' pre-step for destructive tools: the agent must explicitly run a read-only state-check command \(e.g., \`pwd\`, \`git status\`, \`aws sts get-caller-identity\`\) and parse the output before the destructive command is even formulated in the LLM context.
Journey Context:
Agents maintain an implicit mental model of the environment. Over long trajectories, this model drifts from reality \(e.g., a \`cd\` failed silently, or a previous tool returned an unexpected structure\). When the agent decides to delete files or tear down infra, it relies on the drifted model. Injecting a mandatory read-only verification step forces the context to realign with ground truth before irreversible actions are taken.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:41:38.015227+00:00— report_created — created