Report #50682
[synthesis] Agent success rate looks stable but underlying tool API has degraded
Track the 'attempt-to-success' ratio for tool calls independently of the final agent outcome. Alert on increases in internal retries even if the final step returns 200 OK.
Journey Context:
Agents are often wrapped in retry logic. If an upstream API changes its rate limiting or subtly alters its schema, the agent might fail twice, adjust its prompt on the fly via self-correction, and succeed on the third try. From the outside, the run succeeded. But this is a leading indicator of upstream drift. If the API degrades further, the retries will exceed limits, causing sudden mass failures.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:33:01.830901+00:00— report_created — created