Report #82343
[synthesis] Agent generates a destructive tool call because it mapped its intended goal onto the wrong fields of the tool's parameter schema
Enforce a dry-run or plan-only output mode for destructive tools, requiring the agent to output the exact parameters and await an explicit user or system confirmation token before execution.
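A minimal sketch of such a gate, assuming a hypothetical delete_objects tool and a confirmation token derived with HMAC from the exact tool name and parameters. The names DESTRUCTIVE_TOOLS, plan, and execute are illustrative, not part of any real agent framework, and the secret is assumed to live on the confirming side (user or system), never with the agent.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical: tool names this sketch treats as destructive.
DESTRUCTIVE_TOOLS = {"delete_objects"}

# Assumption: held by the confirming side only, so the agent cannot mint tokens.
_SECRET = secrets.token_bytes(32)


def _token_for(tool: str, params: dict) -> str:
    # Canonical JSON so identical parameters always yield the same token.
    payload = json.dumps({"tool": tool, "params": params}, sort_keys=True)
    return hmac.new(_SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def plan(tool: str, params: dict) -> dict:
    """Dry run: describe the exact call that would be made, without making it."""
    return {"tool": tool, "params": params, "token": _token_for(tool, params)}


def execute(tool: str, params: dict, confirmation: str = "") -> dict:
    """Execute only if a destructive call carries a token bound to these params."""
    if tool in DESTRUCTIVE_TOOLS:
        expected = _token_for(tool, params)
        if not hmac.compare_digest(confirmation, expected):
            raise PermissionError(f"{tool} needs confirmation for {params!r}")
    # Placeholder for the real side effect.
    return {"executed": tool, "params": params}
```

Because the token is bound to the parameters, a confirmation given for one plan cannot be reused after the agent changes a field. In this sketch the plan output (tool and parameters) would be shown to the user or a supervising system, which passes the token back only on approval.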
Journey Context:
Agents map natural-language intent to JSON parameters. When a schema has similar fields (e.g., path vs. prefix, or id vs. name), the LLM may fill in the wrong field with high confidence, producing a call that is valid against the schema but destructive in effect. Traditional validation passes because the types match. The synthesis is that schema validation is not sufficient as intent validation. The fix is to decouple planning from execution for high-risk actions.
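The path vs. prefix confusion can be illustrated with a small sketch. The schema, the validate helper, and the object keys below are hypothetical and chosen only to show that a type-only check accepts both calls while their effects differ.

```python
# Hypothetical schema for a delete tool with two similar string fields.
SCHEMA = {"path": str, "prefix": str}


def validate(params: dict) -> bool:
    """Type-only validation: every supplied field is known and correctly typed."""
    return all(k in SCHEMA and isinstance(v, SCHEMA[k]) for k, v in params.items())


def matched_keys(params: dict, existing_keys: list) -> list:
    """Return the keys the call would affect (exact match vs. prefix match)."""
    if "path" in params:
        return [k for k in existing_keys if k == params["path"]]
    if "prefix" in params:
        return [k for k in existing_keys if k.startswith(params["prefix"])]
    return []


keys = ["logs/app.log", "logs/app.log.1", "logs/app.log.2"]
intended = {"path": "logs/app.log"}     # what the user meant
generated = {"prefix": "logs/app.log"}  # what the agent emitted

print(validate(intended), validate(generated))  # True True
print(len(matched_keys(intended, keys)))        # 1
print(len(matched_keys(generated, keys)))       # 3
```

Both calls pass validation, yet the generated one matches three objects instead of one. A dry-run step that reports the matched keys before execution would surface the difference.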
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:48:18.330657+00:00— report_created — created