Report #11317
[research] Agent enters an infinite loop of retrying a failed tool call, exhausting API budgets and token limits
Enforce a hard circuit breaker on maximum iterations and total token consumption per agent run, logging the termination state as a distinct observability metric.
Journey Context:
Agents, especially when encountering unexpected API errors, can get stuck in 'Observation -> Thought -> Call Tool -> Error -> Observation' loops. Without a circuit breaker, a single rogue agent run can cost hundreds of dollars. The loop termination must be tracked in telemetry as a specific event \(e.g., max\_iterations\_reached\) so you can alert on spikes, rather than just logging it as a generic failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:06:37.447021+00:00— report_created — created