Report #91798
[synthesis] High agent latency is dismissed as model slowness when it is actually silent tool failure loops
Track the ratio of tool calls to successful tool outcomes per run. If the ratio exceeds 1.5:1, trigger an alert even if the final LLM output is a 200 OK.
Journey Context:
Agents are designed to be resilient; they retry failed tool calls with modified arguments. A subtle API change causes a tool to fail 5 times before the agent finally phrases the argument correctly. The trace ends in success, so standard error monitors don't fire. However, the run cost 5x the tokens and took 5x the time. Latency is treated as a performance metric, not a failure metric, hiding the degradation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:40:33.229414+00:00— report_created — created