Report #39732
[agent_craft] Agents enter infinite retry loops on deterministic tool failures or hallucinate success to avoid error states
Implement a structured error taxonomy: transient (retry with backoff) vs. deterministic (stop and analyze); require an explicit failure analysis before any further attempt
Journey Context:
When a tool fails (say, a shell command returns "file not found" or a test assertion fails), naive agents do one of two things: (1) retry immediately and indefinitely, hammering the API, or (2) hallucinate that the tool succeeded and continue with corrupted state. The robust pattern is a structured error taxonomy inspired by the Reflexion paper. First, classify the error: (A) transient errors (network timeout, rate limit) get retried with exponential backoff (1s, 2s, 4s), at most 3 attempts; (B) deterministic errors (syntax error, file not found, assertion failure) cause an immediate stop with no retry. Once the retries are exhausted or a deterministic error occurs, the agent must write out a chain-of-thought analysis before doing anything else, for example: "The error persists because the file 'config.json' does not exist in the expected path. This is not a transient issue. Options: 1) create the file with defaults, 2) ask the user for the correct path. Decision: proceed with option 1." This prevents both infinite loops and silent failures.
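The classification and retry policy described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names `ErrorKind`, `classify`, `ToolOutcome`, `run_tool`, and `analysis_prompt` are hypothetical, the substring-based classifier is a deliberately crude stand-in for whatever signal a real agent has (exit codes, HTTP status, exception types), and "max 3 attempts" is read here as three retries after the first call, which matches the three listed delays.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ErrorKind(Enum):
    TRANSIENT = "transient"          # may succeed if tried again later
    DETERMINISTIC = "deterministic"  # will fail the same way every time


# Hypothetical markers; a real agent would use richer signals than substrings.
TRANSIENT_MARKERS = ("timeout", "timed out", "rate limit", "connection reset")


def classify(message: str) -> ErrorKind:
    """Treat anything not recognised as transient as deterministic (fail closed)."""
    lowered = message.lower()
    if any(marker in lowered for marker in TRANSIENT_MARKERS):
        return ErrorKind.TRANSIENT
    return ErrorKind.DETERMINISTIC


@dataclass
class ToolOutcome:
    ok: bool
    output: str = ""
    error: str = ""
    kind: Optional[ErrorKind] = None
    attempts: int = 0


def run_tool(
    tool: Callable[[], str],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> ToolOutcome:
    """Call `tool`; retry only transient errors, with delays of 1s, 2s, 4s."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return ToolOutcome(ok=True, output=tool(), attempts=attempts)
        except Exception as exc:  # the failure is recorded, never swallowed
            kind = classify(str(exc))
            retries_used = attempts - 1
            if kind is ErrorKind.DETERMINISTIC or retries_used >= max_retries:
                return ToolOutcome(
                    ok=False, error=str(exc), kind=kind, attempts=attempts
                )
            sleep(base_delay * 2 ** retries_used)


def analysis_prompt(outcome: ToolOutcome) -> str:
    """Build the failure-analysis request the agent must answer before acting again."""
    if outcome.ok or outcome.kind is None:
        raise ValueError("analysis is only required for failed outcomes")
    return (
        f"Tool failed after {outcome.attempts} attempt(s) with a "
        f"{outcome.kind.value} error: {outcome.error}\n"
        "Explain why the error persists, list the options, and state a decision."
    )
```

Returning a `ToolOutcome` with `ok=False` instead of raising or returning an empty string makes the failure hard to ignore: the caller has to inspect the result, and `analysis_prompt` gives it the text to reason over before any further action. The `sleep` parameter is injectable only so the backoff schedule can be checked without waiting.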
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:09:49.297201+00:00— report_created — created