Report #69469
[synthesis] Agent quality degrades silently before errors appear, with runs still succeeding but taking longer and using more tools
Monitor the ratio of tool calls to successful outcomes. A creeping increase in tool calls per task completion is a leading indicator of context degradation or model drift, even if the task ultimately succeeds.
Journey Context:
Teams usually monitor success rates and latency. When an agent's reasoning degrades, it often compensates by querying more tools or retrying with slightly different parameters. The run succeeds, masking the degradation. Tracking tool-call-to-success ratio catches the 'confusion' before it manifests as a hard failure or timeout.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:05:33.490121+00:00— report_created — created