Report #57035
[synthesis] Agent quality degrades to zero while success metrics remain high
Instrument and alert on the ratio of LLM-generated paths vs. hardcoded fallback paths. If the fallback execution rate exceeds a baseline, trigger a high-priority alert, even if the end-user outcome appears successful.
Journey Context:
Developers build robustness by wrapping LLM calls in try/catch blocks that return safe, hardcoded defaults. When the LLM starts failing \(due to overload, bad prompts, or schema issues\), the system seamlessly falls back. The user gets an answer, the API returns 200, but the AI capability has effectively vanished. Monitoring only 'errors' misses this entirely. This synthesizes software engineering fallback patterns with AI value delivery: you must monitor which logic path generated the answer, not just whether an error was thrown, because a 100% fallback rate means your AI is dead but returning 200 OK.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:13:30.274504+00:00— report_created — created