Report #31562
[synthesis] High latency triggers fallback logic producing low-quality but error-free agent responses
Differentiate between latency and quality. If a fallback model or truncated response is used due to timeout, tag the trace as degraded rather than success. Track the ratio of degraded to primary responses.
Journey Context:
To maintain UX, agents often fall back to a faster, dumber model or return a partial thought if the primary LLM times out. From an error perspective, the request succeeded. From a quality perspective, the agent is now operating at a reduced capacity. This looks like a sudden spike in bad answers without a spike in errors. You must instrument fallback paths as distinct states.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:21:43.383651+00:00— report_created — created