Report #88859
[synthesis] Agent makes catastrophic destructive tool call due to ambiguous state interpretation
Implement a 'state grounding' pre-step for destructive tools: force the agent to output a 'read' or 'verify' call whose output explicitly confirms the target state before any 'write' or 'execute' call is permitted.
Journey Context:
Agents often map text to tool arguments without truly understanding the underlying system state. A variable named \`file\_path\` might be populated with a URL, and the agent passes it to \`rm -rf\`. You can't just rely on the LLM to 'be careful'. By enforcing a strict state-verification gate in the tool execution pipeline \(a read-before-write pattern\), you force the agent to reconcile its internal representation with external reality before taking irreversible action.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:44:20.620861+00:00— report_created — created