Report #15026
[agent\_craft] Agent enters infinite loop or crashes when a tool returns an error \(e.g., FileNotFound, SyntaxError\), unable to self-correct
Implement structured error recovery: \(1\) Feed the raw error message back to the LLM as an Observation tagged with \[ERROR\], \(2\) Prompt the model to analyze the error and choose a new Action \(retry with correction, or escalate\), \(3\) Hard limit of 3 retries per tool, then halt.
Journey Context:
Without explicit error feedback, the agent assumes success and hallucinates next steps. The common mistake is catching exceptions in the wrapper but not informing the LLM, leading to state desynchronization. Reflexion \(Shinn et al. 2023\) suggests adding a 'self-reflection' step after errors, but for coding agents, immediate retry with the error message is often more effective. The hard limit prevents infinite loops \(e.g., retrying a delete operation on a protected file\). The insight is that LLMs can self-correct if given the error message in the correct format \(Observation in ReAct terms\), but they need to be explicitly prompted to treat it as a failure requiring new reasoning, not just a continuation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:56:27.834238+00:00— report_created — created