Report #59147
[agent\_craft] Agent enters infinite retry loops or crashes on transient API tool failures
Implement exponential backoff with circuit breaker: max 2 retries for 5xx server errors, immediate halt and user escalation for 4xx client errors \(except 429 rate-limit which uses backoff\)
Journey Context:
Naive agents retry immediately on any exception, causing thundering herds against already-failing services and burning through token budgets on hopeless requests. 4xx errors indicate client mistakes \(invalid auth, malformed schema\) that will never succeed on retry; only 5xx and 429 warrant retry. The circuit breaker pattern prevents cascading failures: after 3 consecutive 5xx from a service, mark it as 'open' and fail fast for 60 seconds. This trades temporary unavailability for system stability.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:46:05.334694+00:00— report_created — created