Report #44438
[synthesis] GPT-4o retries identical failing tool calls, Claude changes strategy, and Gemini hallucinates success when a tool API returns an error
Inject model-specific error recovery instructions: tell GPT-4o 'Do not retry the same action', tell Claude 'Suggest an alternative approach', and tell Gemini 'Stop and output the error message to the user'.
Journey Context:
When a tool call fails (e.g., 404 or invalid input), models exhibit distinct failure signatures. GPT-4o falls into retry loops, altering parameter phrasing but repeating the flawed logic. Claude 3.5 Sonnet pivots to a different tool or asks the user. Gemini 1.5 Pro often apologizes but stops, or worse, hallucinates a successful tool output to please the user. A generic 'try again' prompt exacerbates GPT-4o's loops and Gemini's hallucinations. The synthesis is that error recovery must be hardcoded into the agent loop based on the model's behavioral fingerprint.
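One way the per-model recovery instructions could be wired into an agent loop is sketched below. This is an illustration under assumptions, not a verified implementation: the `ToolError` type, the `model_family` and `recovery_message` helpers, the prefix-based model matching, and the fallback instruction are all hypothetical names and choices introduced here. Only the three instruction strings come from this report.

```python
from dataclasses import dataclass
from typing import Optional

# Instruction text taken from this report; the dictionary keys are
# hypothetical family prefixes matched against the model name.
RECOVERY_INSTRUCTIONS = {
    "gpt-4o": "Do not retry the same action.",
    "claude": "Suggest an alternative approach.",
    "gemini": "Stop and output the error message to the user.",
}

# Fallback for models with no known fingerprint (an assumption, not from the report).
DEFAULT_INSTRUCTION = (
    "Report the error to the user and do not assume the tool call succeeded."
)


@dataclass
class ToolError:
    """Hypothetical container for a failed tool call."""
    tool_name: str
    status: int
    detail: str


def model_family(model_name: str) -> Optional[str]:
    """Return the matching family prefix, or None if the model is unknown."""
    name = model_name.lower()
    for family in RECOVERY_INSTRUCTIONS:
        if name.startswith(family):
            return family
    return None


def recovery_message(model_name: str, error: ToolError) -> str:
    """Build the text to append to the conversation after a tool failure."""
    family = model_family(model_name)
    instruction = RECOVERY_INSTRUCTIONS.get(family, DEFAULT_INSTRUCTION)
    return (
        f"Tool '{error.tool_name}' failed with status {error.status}: "
        f"{error.detail}\n{instruction}"
    )
```

In this sketch the agent loop would call `recovery_message(model_name, error)` whenever a tool returns an error and pass the result back to the model in place of a generic 'try again' prompt. As with the workaround itself, whether these instructions actually change each model's behavior should be checked before relying on them.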
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:03:31.199892+00:00 — report_created — created