Report #58573
[synthesis] Agent hallucinates tool responses during downstream API latency spikes
In traces, record 'LLM generation time' and 'tool execution time' as separate measurements. Set a hard timeout on each tool execution and, when it expires, return an explicit error object to the LLM instead of letting the orchestrator fall back to speculative LLM generation.
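A minimal sketch of the tool-level timeout, assuming an asyncio-based orchestrator. The names `call_tool_with_timeout` and `slow_api`, and the shape of the error object, are hypothetical and not taken from any specific agent framework.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict


async def call_tool_with_timeout(
    tool: Callable[..., Awaitable[Any]],
    args: Dict[str, Any],
    timeout_s: float,
) -> Dict[str, Any]:
    """Run one tool call under a hard timeout.

    Always returns a dict. On timeout the dict is an explicit error object,
    so the LLM is told that no result exists instead of being left to guess.
    """
    started = time.monotonic()
    try:
        # wait_for cancels the pending tool coroutine when the timeout expires.
        result = await asyncio.wait_for(tool(**args), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {
            "status": "error",
            "error_type": "tool_timeout",  # hypothetical field name
            "message": (
                f"Tool did not respond within {timeout_s} s. "
                "No result is available; do not infer one."
            ),
            "elapsed_s": time.monotonic() - started,
        }
    return {
        "status": "ok",
        "result": result,
        "elapsed_s": time.monotonic() - started,
    }


async def slow_api(delay_s: float) -> str:
    """Stand-in for a downstream API under load (hypothetical)."""
    await asyncio.sleep(delay_s)
    return "payload"


if __name__ == "__main__":
    outcome = asyncio.run(
        call_tool_with_timeout(slow_api, {"delay_s": 0.5}, timeout_s=0.05)
    )
    print(outcome["status"], outcome["error_type"])
```

This only interrupts tools that are themselves coroutines. A blocking synchronous client would need its own request timeout or a separate worker, since `asyncio.wait_for` cannot preempt blocking code.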
Journey Context:
When a downstream API slows down under load, the orchestrator waits. If the agent is configured with a long overall timeout but no strict tool-level timeouts, the LLM may hallucinate the tool's return value to satisfy its completion mandate, or the orchestrator may skip the tool call entirely and guess. The trace then shows a successful completion, even though the latency spike produced a reasoning shortcut rather than a timeout error.
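One way to make this failure visible is to record LLM generation and tool execution as distinct span kinds, then check that a run reported as successful actually contains a completed span for each tool it was expected to call. The sketch below is illustrative only: `Span`, `Trace`, `missing_tool_spans`, and the tool name `get_inventory` are hypothetical and do not correspond to a particular tracing library.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Span:
    kind: str  # "llm_generation" or "tool_execution"
    name: str
    duration_s: float = 0.0
    status: str = "ok"


@dataclass
class Trace:
    spans: List[Span] = field(default_factory=list)

    @contextmanager
    def span(self, kind: str, name: str) -> Iterator[Span]:
        """Time a block of work and record it under the given kind."""
        s = Span(kind=kind, name=name)
        started = time.monotonic()
        try:
            yield s
        except Exception:
            s.status = "error"
            raise
        finally:
            s.duration_s = time.monotonic() - started
            self.spans.append(s)

    def total(self, kind: str) -> float:
        """Total time spent in spans of one kind."""
        return sum(s.duration_s for s in self.spans if s.kind == kind)

    def missing_tool_spans(self, expected_tools: List[str]) -> List[str]:
        """Expected tools with no successfully completed tool_execution span."""
        seen = {
            s.name
            for s in self.spans
            if s.kind == "tool_execution" and s.status == "ok"
        }
        return [t for t in expected_tools if t not in seen]


if __name__ == "__main__":
    trace = Trace()
    with trace.span("llm_generation", "plan"):
        pass  # the model decides to call get_inventory
    # The tool call is skipped here, imitating the reasoning shortcut.
    with trace.span("llm_generation", "final_answer"):
        pass  # the model answers anyway
    print(trace.missing_tool_spans(["get_inventory"]))
```

A non-empty result from `missing_tool_spans` on a run that otherwise looks successful suggests the answer was generated without the tool's data.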
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:48:14.424678+00:00— report_created — created