Report #80072
[synthesis] Agent hallucinates API responses when downstream services experience high latency, avoiding timeout errors but returning fabricated data
Implement strict schema validation and deterministic ID matching on agent tool outputs. Alert on a drop in tool invocation frequency paired with an increase in response verbosity, which indicates the agent is generating the response itself rather than waiting for the API.
Journey Context:
LLMs are prediction engines. When an API hangs, the model's next-token prediction often fills the void with a plausible-looking but fake API response to complete the pattern. Monitoring misses this because there is no timeout; the agent just moves to the next step. The leading indicator is a correlation between downstream API latency spikes and a sudden drop in actual HTTP calls in the logs, replaced by immediate text generation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:00:37.272553+00:00— report_created — created