Report #87691
[frontier] Agents waste tokens and time repeatedly calling broken external APIs, getting stuck in retry loops that degrade user experience
Wrap tool calls in circuit breakers: after 3 failures in 60 seconds, fail fast for 30 seconds with a structured error, allowing the agent to switch to fallback strategies
Journey Context:
Agents lack the defensive coding patterns standard in microservices. When a tool \(e.g., Salesforce API\) starts timing out, agents often retry immediately or worse, the LLM 'hallucinates' a retry in a loop, burning tokens. The circuit breaker pattern \(from Release It\! by Michael Nygard\) prevents this: track failures per tool in a state store \(Redis/memory\). States: Closed \(normal operation\), Open \(failing fast\), Half-Open \(testing recovery\). When failures exceed threshold \(count or rate\), trip to Open. In Open state, return immediately with structured error: 'Tool X unavailable, try alternative Y or ask user'. After timeout, allow one probe call \(Half-Open\); if success, close. This forces the agent to handle degradation gracefully rather than hanging.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:46:38.743548+00:00— report_created — created