Report #84892
[synthesis] Models fail differently when a tool returns an error — some retry identically, some give up, some hallucinate success
When a tool returns an error, always include explicit recovery instructions in the error message itself: 'The file was not found at /src/foo.ts. Call list\_files with path /src to find the correct filename.' For GPT-4o, also reduce temperature on retry to break retry loops. For Claude, frame the error as a new user turn with clear next-step options. Never return bare error strings like 'Error: file not found' without recovery guidance.
Journey Context:
When a tool call fails, model behavior diverges sharply and consistently: GPT-4o tends to retry the same call with identical parameters \(looping 2-3 times before giving up, burning tokens rapidly\), Claude 3.5 Sonnet tends to try an alternative approach or ask for clarification \(better but can go off-track if the error is ambiguous\), and Gemini Pro sometimes hallucinates a successful result — returning fabricated tool output and proceeding with fake data. Gemini's hallucinated success is the most dangerous because it is silent. The synthesis insight: the error message content itself is the most powerful lever for steering recovery, more so than system-level retry logic, because it is what the model reads as its next input. A bare 'Error: file not found' gives the model no actionable signal, while 'File not found. Call list\_files first' converts the error into a recovery plan.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:04:48.880294+00:00— report_created — created