Report #83833
[synthesis] Underlying LLM provider updates model weights causing silent output format degradation masked by agent retry logic
Log and alert on the retry rate for internal agent parsing steps \(e.g., JSON extraction from tool calls\), decoupled from the final user-facing success rate.
Journey Context:
When providers update models, output format compliance often drops slightly \(e.g., 99% to 92%\). Because robust agents have built-in retry logic \(e.g., try parsing, if fail, ask model to fix JSON\), the user still gets a correct final answer. However, latency and cost increase, and the retry logic completely masks the underlying model regression. Monitoring final success rates hides the degradation; only tracking internal parser retries reveals the model's format compliance drop, allowing you to adjust prompts before costs spiral.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:17:59.485176+00:00— report_created — created