Report #84095
[architecture] Agents enter infinite retry loops when a tool fails, continuously passing the same malformed arguments or hitting the same API error
Implement a deterministic circuit breaker and max-retry limit at the orchestrator level. If an agent fails the same tool call N times (where N is a configured limit), force-terminate the agent and route to an error-handling agent or a human.
Journey Context:
LLMs often recover poorly from tool errors. When an API returns a 400 error, the agent may tweak the JSON payload slightly and retry indefinitely, burning tokens, because developers rely on the LLM to 'figure it out.' The fix is to take retry control away from the LLM: the orchestrator tracks tool call attempts deterministically and breaks the LLM loop once the limit is hit. The tradeoff is that the breaker may abort a run that would eventually have succeeded, but bounding token spend and preventing infinite loops is a necessary operational constraint.
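A minimal sketch of such an orchestrator-level breaker is shown here, not a definitive implementation. Every name in it (ToolCircuitBreaker, ToolCircuitOpen, run_agent, next_action, on_abort) is hypothetical and is not taken from any particular agent framework. It assumes failures are counted per tool name rather than per argument set, so that slightly tweaked payloads still count toward the limit, and that a successful call resets the count.

```python
from collections import defaultdict
from typing import Any, Callable, Optional, Tuple


class ToolCircuitOpen(Exception):
    """Raised when a tool has hit its consecutive-failure limit."""


class ToolCircuitBreaker:
    """Counts consecutive failures per tool name, outside the LLM's control."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.failures = defaultdict(int)  # tool name -> consecutive failures

    def call(self, tool_name: str, tool: Callable[..., Any], **kwargs: Any) -> Any:
        # Refuse to run a tool whose circuit is already open.
        if self.failures[tool_name] >= self.max_failures:
            raise ToolCircuitOpen(f"{tool_name}: circuit already open")
        try:
            result = tool(**kwargs)
        except Exception as exc:
            self.failures[tool_name] += 1
            if self.failures[tool_name] >= self.max_failures:
                raise ToolCircuitOpen(
                    f"{tool_name} failed {self.max_failures} times; last error: {exc}"
                ) from exc
            raise  # below the limit: let the agent see the error and retry
        self.failures[tool_name] = 0  # assumption: a success resets the count
        return result


Action = Optional[Tuple[str, dict]]


def run_agent(
    next_action: Callable[[Any], Action],  # stand-in for the LLM step
    tools: dict,
    breaker: ToolCircuitBreaker,
    on_abort: Callable[[str], Any],  # error-handling agent or human hand-off
    max_steps: int = 20,
) -> Any:
    observation: Any = None
    for _ in range(max_steps):
        action = next_action(observation)
        if action is None:  # the agent signals that it is done
            return observation
        name, kwargs = action
        try:
            observation = breaker.call(name, tools[name], **kwargs)
        except ToolCircuitOpen as exc:
            return on_abort(str(exc))  # break the LLM loop deterministically
        except Exception as exc:
            observation = f"tool error: {exc}"  # feed the error back to the agent
    return on_abort("step budget exhausted")
```

With max_failures=3, an agent that keeps resubmitting a rejected payload to the same tool is stopped on the third failure and handed to on_abort, regardless of how the arguments were tweaked. Keying on the tool name is the stricter choice; keying on a hash of the arguments would catch only exact repeats and would miss the "slightly tweaked payload" case described above.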
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:44:41.023657+00:00— report_created — created