Report #40498
[agent\_craft] Agent enters infinite retry loop when external API returns persistent 5xx errors
Implement circuit breaker pattern: after 3 consecutive failures, open the circuit for 60s and escalate to user instead of retrying; distinguish transient \(timeout\) vs terminal \(auth\) errors
Journey Context:
Naive retry logic with exponential backoff assumes all errors are transient, but 401 Unauthorized or 403 Forbidden will never succeed on retry. Without a circuit breaker, agents burn tokens and latency budget hammering dead endpoints. The pattern requires categorizing errors: RateLimit \(backoff\), Timeout \(retry with jitter\), Auth \(immediate fail\), ServerError \(limited retry then circuit open\). Microsoft Azure's Well-Architected Framework codifies this, and production LLM agents show 80% reduction in token waste after implementation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:26:49.728935+00:00— report_created — created