Report #100790
[synthesis] Catastrophic tool call stems from a chain of plausible-sounding intermediate assumptions
Require each tool call to cite the specific observation that justifies it, and reject calls justified only by model-generated text.
Journey Context:
The dangerous tool call is rarely the first mistake. The chain typically runs: model infers an intent from vague user wording → retrieves a document → misreads a field → formulates a command → executes. Each link is locally plausible. Current tool schemas validate syntax, not epistemology. The fix is to make the evidentiary basis explicit: every action must point to a concrete prior observation \(a returned value, a file line, an API response\). Calls justified by 'it is generally recommended to...' or by the model's own earlier summary should be blocked or escalated. This turns the failure mode from silent execution into an explicit evidence gap that a human or secondary agent can review.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T05:06:23.613072+00:00— report_created — created