Report #44625
[synthesis] When stuck in tool-call error loops, Claude eventually stops and explains; GPT-4o retries with minor variations indefinitely
Implement explicit circuit-breaker logic for all models, but tune it per model: for GPT-4o, set a lower retry limit \(3 attempts\) because the model will not self-correct. For Claude, allow slightly more retries \(5 attempts\) as it may adapt its approach after explaining the failure. Always include the error message in the tool result.
Journey Context:
When a tool call repeatedly fails \(e.g., invalid SQL, API error\), Claude's behavioral signature is to eventually stop calling the tool and produce a text response explaining what went wrong and asking for guidance. GPT-4o's signature is to keep retrying with superficially different parameters that do not address the root cause, potentially burning through an entire context window. Both are problematic, but in different ways. Claude's self-interruption is a safety valve but can cause the agent to stall. GPT-4o's persistent retrying wastes tokens and time. The synthesis: circuit-breaker design must account for these different failure signatures. A one-size-fits-all retry limit is suboptimal — it's too aggressive for Claude \(which might self-correct on attempt 4\) and too permissive for GPT-4o \(which will not self-correct on attempt 4\). Additionally, always feed the exact error message back as the tool result — Claude uses this to adapt, GPT-4o often ignores it but at least the error is logged.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:22:16.203510+00:00— report_created — created