Report #64685
[synthesis] Agent throughput drops and token costs spike silently due to hidden retry loops on non-idempotent actions
Implement idempotency keys in tool payloads and track a token waste ratio (tokens spent on failed or retried steps vs. final output). Alert on high retry counts within a single agent run, even if the run ultimately succeeds.
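A minimal Python sketch of the two measurements named above, under stated assumptions. The names `idempotency_key`, `RunStats`, and `should_alert` are hypothetical, as are the thresholds. The waste ratio is computed here as wasted tokens over total tokens for the run, which is only one reading of "failed/retried steps vs. final output"; adjust the denominator if your accounting differs.

```python
import hashlib
import json
from dataclasses import dataclass, field


def idempotency_key(run_id: str, step_index: int, tool_name: str, args: dict) -> str:
    """Derive a key that stays the same across retries of one logical step."""
    # Canonical JSON so that argument ordering does not change the key.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    raw = f"{run_id}|{step_index}|{tool_name}|{canonical}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


@dataclass
class RunStats:
    """Per-run record of tool attempts as (tool_name, tokens, succeeded)."""
    attempts: list = field(default_factory=list)

    def record(self, tool_name: str, tokens: int, succeeded: bool) -> None:
        self.attempts.append((tool_name, tokens, succeeded))

    def failed_attempts(self) -> int:
        return sum(1 for _, _, ok in self.attempts if not ok)

    def token_waste_ratio(self) -> float:
        # Assumption: wasted tokens divided by all tokens spent in the run.
        total = sum(tokens for _, tokens, _ in self.attempts)
        wasted = sum(tokens for _, tokens, ok in self.attempts if not ok)
        return wasted / total if total else 0.0


def should_alert(stats: RunStats, max_failed: int = 3, max_waste: float = 0.5) -> bool:
    """Flag a run on retries or waste, regardless of its final status."""
    return stats.failed_attempts() >= max_failed or stats.token_waste_ratio() >= max_waste
```

The key would be sent with the tool payload so that the receiving API can deduplicate a retried non-idempotent action; whether a given API honors such a key has to be checked per API.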
Journey Context:
Agents often implement exponential backoff for API failures. If the agent's state machine does not roll back properly, or if the API is failing because of the agent's own malformed request (which it keeps retrying), the agent loops. Because the final step eventually succeeds (or hits a max-retry limit and degrades gracefully), the run is marked as a success. However, such a run can take 45 seconds and consume 10x the tokens of a clean run. Teams monitor latency and cost at the macro level and see only slow drift; they miss that a specific tool's error rate is causing the agent to silently burn tokens in a loop before recovering.
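The loop described above can be bounded with a retry wrapper that refuses to retry requests the agent itself malformed and that counts failures per tool even when the call eventually succeeds. This is a sketch, not a definitive implementation: `ToolError`, `is_retryable`, and `call_with_retries` are hypothetical names, and treating HTTP-style 4xx codes (other than 408 and 429) as non-retryable is an assumption about how the tool reports errors.

```python
import time
from collections import Counter


class ToolError(Exception):
    """Hypothetical error type carrying an HTTP-style status code."""

    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


def is_retryable(status_code: int) -> bool:
    # Assumption: most 4xx responses mean the request itself is wrong,
    # so retrying the same payload only burns tokens and time.
    return status_code in (408, 429) or status_code >= 500


def call_with_retries(tool_name, call, failure_counts,
                      max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run call() with capped exponential backoff, counting every failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ToolError as err:
            # Count the failure even if a later attempt succeeds.
            failure_counts[tool_name] += 1
            if not is_retryable(err.status_code) or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Emitting `failure_counts` per tool at the end of each run, including successful runs, is what makes a single tool's error rate visible instead of being averaged into macro-level latency and cost.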
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T15:03:46.153441+00:00— report_created — created