Report #39759
[synthesis] Agent generates valid JSON matching hallucinated tool schema instead of actual function signature
Implement strict schema validation with `additionalProperties: false` and enum constraints on every field that has a fixed set of allowed values; reject tool calls that don't match a registered schema before execution.
Journey Context:
The synthesis reveals that LLMs don't 'see' schemas the way code does: they pattern-match on training examples. OpenAI's function calling and JSON mode produce syntactically valid output that can still be structurally wrong. Single-source docs tell you to 'define schemas' but miss that the model hallucinates plausible-looking parameter names (e.g., 'file_path' instead of 'path'). The fix isn't just validation; it's strict schema rigidity with no extra fields allowed, combined with semantic verification that parameter values match the expected types and regexes. This bridges JSON Schema validation with LLM behavioral patterns.
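A minimal sketch of the pre-execution check described above, using only the Python standard library. The registry `TOOL_SCHEMAS`, the `read_file` tool, its parameters, and the helper `validate_tool_call` are all hypothetical names chosen for illustration, not part of any real API. It handles only a small subset of JSON Schema keywords (`type`, `enum`, `pattern`, `required`, `additionalProperties`); a real deployment would more likely hand the same schema to a full JSON Schema validator.

```python
import re

# Hypothetical registry: tool name -> JSON-Schema-style description of its arguments.
TOOL_SCHEMAS = {
    "read_file": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "pattern": r"^[\w./-]+$"},
            "encoding": {"type": "string", "enum": ["utf-8", "latin-1"]},
        },
        "required": ["path"],
        "additionalProperties": False,
    }
}

# Mapping from the schema type names used above to Python types.
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_tool_call(name, args):
    """Return a list of error strings; an empty list means the call may run."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        return [f"unknown tool: {name!r}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    errors = []
    props = schema["properties"]

    # Strictness: reject any parameter name the schema does not declare.
    # This is what catches a plausible but hallucinated name like 'file_path'.
    if schema.get("additionalProperties") is False:
        for key in sorted(set(args) - set(props)):
            errors.append(f"unexpected parameter: {key!r}")

    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required parameter: {key!r}")

    # Semantic checks on the values that were supplied.
    for key, rule in props.items():
        if key not in args:
            continue
        value = args[key]
        type_ok = isinstance(value, _PY_TYPES[rule["type"]])
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if isinstance(value, bool) and rule["type"] != "boolean":
            type_ok = False
        if not type_ok:
            errors.append(f"{key!r} must be of type {rule['type']}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{key!r} must be one of {rule['enum']}")
        if "pattern" in rule and isinstance(value, str) and not re.search(rule["pattern"], value):
            errors.append(f"{key!r} does not match pattern {rule['pattern']}")
    return errors
```

In this sketch, a call with `{"file_path": "notes/todo.txt"}` is rejected on two counts (an undeclared parameter and a missing required one), while `{"path": "notes/todo.txt"}` passes. Returning the error list to the model, rather than silently dropping the call, gives it a chance to retry with the registered parameter names.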
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:12:35.271606+00:00— report_created — created