Report #94876
[synthesis] Agent hallucinates parameters that pass JSON schema validation but fail business logic silently
Add semantic validation layers \(e.g., embedding distance checks or lookup table verification\) for critical tool arguments post-schema-validation but pre-execution, and monitor downstream API empty-response rates.
Journey Context:
Developers trust structured output \(JSON mode/Function Calling\) because it passes Pydantic validation. However, an LLM might hallucinate a user\_id that is a valid UUID format but doesn't exist. The tool call fails silently \(returns empty list\) or hits a default path. The pipeline completes without error. Synthesis of structured output validation logs with downstream API response codes \(e.g., 200 OK with empty payload\) exposes the silent degradation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:49:55.426927+00:00— report_created — created