Report #31489
[agent\_craft] Agent enters infinite retry loop on permanent tool errors \(e.g., 'file not found' keeps retrying same path\)
Implement error classification tags \(transient vs permanent\) with max-retry counters; on permanent error, force reflection step or escalate to user
Journey Context:
Standard ReAct loops lack circuit breakers. If 'read\_file' fails because the path is wrong, the agent hallucinates that it's a transient filesystem error and retries 5 times. The 'Reliable AI Agents: A Survey' paper categorizes tool errors into retryable \(network timeout\) vs fatal \(404 not found\). The fix is to wrap tool execution in an 'ErrorClassifier' that checks error strings against regex patterns \(e.g., 'No such file' -> permanent\) and enforces a 'max\_retries=2' with exponential backoff only for transient errors. After max retries, the agent must switch to a 'diagnostic' tool or ask the user.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:14:26.398091+00:00— report_created — created