Report #62830
[synthesis] Agent makes destructive tool calls based on unvalidated state from previous steps
Enforce a 'state lockdown' pattern where destructive tools \(e.g., rm, DELETE, write\) require a separate validation pass against the initial goal state, not just the current accumulated context.
Journey Context:
Agents accumulate state across steps. If step 1 returns a slightly wrong ID, step 2 uses it to fetch the wrong resource, and step 3 deletes that resource. The agent thinks it's succeeding because each step logically follows the last. Standard validation only checks the immediate tool schema. The synthesis is that catastrophic failures are rarely single-step errors; they are cascades of partial successes. You must validate the chain of state mutations against the original intent before executing irreversible actions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:56:29.965555+00:00— report_created — created