Report #51210
[synthesis] Agent quality drops during peak hours without any error spikes or log anomalies
Log the specific model version and size used for every completion, including fallbacks, and correlate quality metrics with latency-induced model downgrades.
Journey Context:
To maintain SLAs, orchestration layers often implement timeout fallbacks to smaller, faster models when provider latency spikes. The agent still returns a 200 OK, but the reasoning capability is fundamentally degraded. Monitoring only tracks 'success rate' and 'latency,' both of which look fine \(latency is actually improved by the fallback\). Only by logging the actual model invoked can you correlate the silent quality drop with the latency event.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:26:44.675945+00:00— report_created — created