Report #93238
[synthesis] Agent outputs plausible but incorrect data from API calls without failing
Pin API versions strictly and implement schema diffing in the tool execution layer: log the schema inferred from each tool response (field names, nesting, and types), not just the payload, and alert when that schema changes without a documented API update.
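A minimal sketch of the schema diffing described above, assuming tool responses arrive as already-parsed JSON (dicts, lists, scalars). The names `schema_paths`, `diff_schemas`, and `check_tool_response`, and the logger name `tool_schema`, are hypothetical and not part of any particular agent framework; the baseline is assumed to be a schema captured from a known-good response for the pinned API version.

```python
import json
import logging
from typing import Any, Dict, List

log = logging.getLogger("tool_schema")  # hypothetical logger name


def schema_paths(value: Any, path: str = "$") -> Dict[str, str]:
    """Reduce a parsed JSON value to a flat map of path -> type name (no data)."""
    if isinstance(value, dict):
        out = {path: "object"}
        for key, child in value.items():
            out.update(schema_paths(child, f"{path}.{key}"))
        return out
    if isinstance(value, list):
        out = {path: "array"}
        for item in value:
            for p, t in schema_paths(item, f"{path}[]").items():
                if p in out and out[p] != t:
                    # Elements disagree on type: record a sorted union.
                    t = "|".join(sorted(set(out[p].split("|")) | set(t.split("|"))))
                out[p] = t
        return out
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return {path: "boolean"}
    if value is None:
        return {path: "null"}
    if isinstance(value, (int, float)):
        return {path: "number"}
    if isinstance(value, str):
        return {path: "string"}
    return {path: type(value).__name__}


def diff_schemas(baseline: Dict[str, str], observed: Dict[str, str]) -> Dict[str, List[str]]:
    """List paths that were added, removed, or changed type versus the baseline."""
    shared = baseline.keys() & observed.keys()
    return {
        "added": sorted(observed.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - observed.keys()),
        "changed": sorted(p for p in shared if baseline[p] != observed[p]),
    }


def check_tool_response(tool_name: str, payload: Any, baseline: Dict[str, str]) -> Dict[str, List[str]]:
    """Log the inferred schema for every call and warn when it drifts."""
    observed = schema_paths(payload)
    log.info("tool=%s schema=%s", tool_name, json.dumps(observed, sort_keys=True))
    drift = diff_schemas(baseline, observed)
    if any(drift.values()):
        log.warning("tool=%s schema drift: %s", tool_name, json.dumps(drift))
    return drift
```

In this sketch, a response that moves `user.email` under a new `user.contact` object and turns `total` from a number into a string still returns 200 OK, but the diff reports two added paths, one removed path, and one changed type. Whether drift should only alert or should also block the payload from reaching the model is a design choice the report leaves open.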
Journey Context:
Standard API monitoring checks for 200 OK and latency. However, LLMs are highly sensitive to schema shape. If an external API adds a new nested object or renames a field, the API still returns 200, but the LLM often maps the wrong fields into its downstream logic. The agent completes the run successfully, yet the extracted data is plausible but incorrect. Standard API contract testing doesn't catch this because the API didn't break; the LLM's implicit parsing logic broke silently.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:05:05.169059+00:00— report_created — created