Report #39338
[synthesis] Agent hallucinates required tool parameters leading to destructive side effects
Use a three-phase execution model for destructive tools: in Phase 1 the LLM generates the exact command and parameters, in Phase 2 a deterministic schema and scope validator (e.g., a linter or dry-run) that the LLM cannot bypass checks them, and in Phase 3 the tool executes only if validation passed.
Journey Context:
LLMs frequently hallucinate parameters that fit the type schema but are semantically wrong (e.g., passing a file path as a regex, or guessing an ID). If the tool is destructive (e.g., delete_file, execute_shell), this can lead to catastrophic failure. Relying on the LLM to self-correct via a prompt ('only use safe parameters') fails because the LLM cannot tell that a parameter is hallucinated and is just as confident in it as in a correct one. The synthesis of function-calling failures shows that LLMs cannot be the final gatekeeper for destructive actions. A deterministic validator (dry-run, linter, or permission check) must intercept the LLM's output before anything executes.
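A minimal sketch of the generate, validate, execute split for a delete_file tool, in Python. All names here (ProposedCall, validate_delete, execute_delete, the sandbox_root parameter) are hypothetical and chosen for illustration; they do not come from any particular agent framework. The sketch assumes the tool takes a single "path" argument and that deletions should be confined to one directory.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass(frozen=True)
class ProposedCall:
    """Phase 1 output: the exact tool name and arguments the LLM proposed."""
    tool: str
    args: dict


class ValidationError(Exception):
    """Raised when a proposed call fails the deterministic checks."""


def validate_delete(call: ProposedCall, sandbox_root: Path) -> Path:
    """Phase 2: schema and scope checks in plain code, with no LLM involved."""
    if call.tool != "delete_file":
        raise ValidationError(f"unexpected tool: {call.tool!r}")
    # Schema check: exactly one string argument named 'path'.
    if set(call.args) != {"path"} or not isinstance(call.args["path"], str):
        raise ValidationError("expected exactly one string argument: 'path'")
    root = sandbox_root.resolve()
    target = (root / call.args["path"]).resolve()
    # Scope check: absolute paths and '..' segments resolve outside the root.
    if not target.is_relative_to(root):  # requires Python 3.9+
        raise ValidationError(f"path escapes sandbox: {target}")
    # Semantic check: a guessed or hallucinated path usually does not exist.
    if not target.is_file():
        raise ValidationError(f"not an existing regular file: {target}")
    return target


def execute_delete(call: ProposedCall, sandbox_root: Path, dry_run: bool = True) -> str:
    """Phase 3: runs only on a path that validate_delete returned."""
    target = validate_delete(call, sandbox_root)
    if dry_run:
        return f"would delete {target}"
    target.unlink()
    return f"deleted {target}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "notes.txt").write_text("scratch")
        print(execute_delete(ProposedCall("delete_file", {"path": "notes.txt"}), root))
        try:
            execute_delete(ProposedCall("delete_file", {"path": "../etc/passwd"}), root)
        except ValidationError as err:
            print("rejected:", err)
```

The property that matters is that execute_delete always calls validate_delete itself, so a prompt cannot route around the check, and dry_run defaults to True so a real deletion has to be requested explicitly by the calling code rather than by the model.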
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:30:09.351229+00:00— report_created — created