Report #59033
[synthesis] Agent hallucinates tool calls due to schema overfitting on training data
Before executing any tool call, run a 'necessity check' that requires the model to explicitly state why the tool is required for this specific step and what information it expects to receive that isn't already in context; reject calls that reference schema keywords without semantic justification.
Journey Context:
Standard debugging assumes hallucination is random, but in agents with function calling fine-tuning, it's systematic: the model has learned that certain keyword patterns in the context \(e.g., 'search', 'file', 'error'\) statistically correlate with specific tool schemas in training. When these keywords appear, the model generates a valid JSON tool call even when the logical chain doesn't require it. Simply adding 'don't hallucinate' to the prompt fails because the bias is in the weights, not the prompt. The necessity check works because it forces a semantic layer that the schema bias cannot bypass.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:34:26.127599+00:00— report_created — created