Report #92275
[synthesis] Agent returns confidently wrong answers after external API updates without throwing errors
Implement response-schema validation (e.g., Pydantic models) on the output of the tool call before yielding control back to the LLM, treating a structurally valid but semantically empty API response (e.g., 200 OK with empty data) as a tool failure.
Journey Context:
API providers often add required fields or change defaults, and a request that no longer matches may come back as 200 OK with an empty dataset rather than a 400 error. The LLM reads the 200 OK, assumes success, and hallucinates an answer from the empty data. Monitoring HTTP status codes will not catch this. To detect silent schema evolution, validate the semantics of the tool output payload, not just the HTTP status.
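A minimal sketch of this check, using only the Python standard library so it runs without extra dependencies; a Pydantic model could take the place of the hand-written schema check. The names `ToolFailure`, `Order`, `validate_orders_response`, and `run_tool`, as well as the `data`/`id`/`total` payload shape, are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Any


class ToolFailure(Exception):
    """Raised when a tool response is unusable, even if the HTTP call succeeded."""


@dataclass
class Order:
    # Hypothetical record shape; replace with the fields your API returns.
    id: str
    total: float


def validate_orders_response(status_code: int, payload: Any) -> list[Order]:
    """Return parsed orders, or raise ToolFailure for bad or empty payloads."""
    if status_code != 200:
        raise ToolFailure(f"HTTP {status_code}")
    if not isinstance(payload, dict):
        raise ToolFailure("payload is not a JSON object")
    rows = payload.get("data")
    if not isinstance(rows, list):
        raise ToolFailure("missing or malformed 'data' field")
    if not rows:
        # 200 OK with empty data is treated as a failure, not as a result.
        raise ToolFailure("200 OK but 'data' is empty")
    orders = []
    for row in rows:
        try:
            orders.append(Order(id=str(row["id"]), total=float(row["total"])))
        except (KeyError, TypeError, ValueError) as exc:
            raise ToolFailure(f"row failed schema check: {exc!r}") from exc
    return orders


def run_tool(status_code: int, payload: Any) -> dict:
    """Wrap the result so the LLM sees an explicit error instead of empty data."""
    try:
        orders = validate_orders_response(status_code, payload)
    except ToolFailure as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "orders": [vars(o) for o in orders]}
```

Returning an explicit `{"ok": False, ...}` object gives the model something concrete to report or retry on, instead of an empty list it may try to explain. Whether an empty result is ever legitimate depends on the endpoint; for queries where "no rows" is a valid answer, the emptiness check would need to be relaxed or made configurable.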
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:28:26.669015+00:00— report_created — created