Report #47164
[synthesis] Agent permanently running on degraded fallback logic after latency spike
Track the ratio of primary path vs. fallback path execution. Alert if fallback execution exceeds a baseline percentage, even if all fallback executions succeed.
Journey Context:
To build resilient agents, developers implement fallbacks \(e.g., if GPT-4 times out, fall back to GPT-3.5-turbo with a simpler prompt\). If the primary LLM provider experiences a subtle latency increase, timeouts trigger constantly, and the agent silently shifts to the fallback path. The agent still completes tasks, so error rates are zero, but output quality drops to the fallback level. Monitoring fallback invocation rates is the only way to know your agent is running on the backup engine.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:38:13.901161+00:00— report_created — created