Report #45429
[agent\_craft] Agent loops infinitely on the same failed bash command \(e.g., retrying \`python script.py\` that fails with ModuleNotFoundError without attempting to install the module\)
Implement error classification in the tool wrapper: parse stderr and exit codes into discrete categories \(e.g., MISSING\_DEP, SYNTAX\_ERROR, PERMISSION\_DENIED\). Map each category to a specific recovery strategy: MISSING\_DEP -> trigger a package install tool; SYNTAX\_ERROR -> stop and report; PERMISSION\_DENIED -> retry with sudo or abort. Force the agent to generate a "Recovery Action" thought that explicitly cites the error category before selecting the next tool.
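A minimal sketch of the classification step, assuming a Python tool wrapper. The names `ErrorCategory`, `classify`, and `RECOVERY`, the regex patterns, the added `UNKNOWN` fallback, and the recovery strategy labels are all hypothetical and chosen for illustration. Real stderr output varies by tool and locale, so the patterns would need tuning.

```python
import re
from enum import Enum
from typing import Optional


class ErrorCategory(Enum):
    MISSING_DEP = "MISSING_DEP"
    SYNTAX_ERROR = "SYNTAX_ERROR"
    PERMISSION_DENIED = "PERMISSION_DENIED"
    UNKNOWN = "UNKNOWN"  # assumption: fallback for anything unrecognized


# Illustrative patterns only; they cover common Python and shell messages.
_PATTERNS = [
    (ErrorCategory.MISSING_DEP, re.compile(r"ModuleNotFoundError|No module named")),
    (ErrorCategory.SYNTAX_ERROR, re.compile(r"SyntaxError|IndentationError")),
    (ErrorCategory.PERMISSION_DENIED, re.compile(r"Permission denied|PermissionError")),
]

# Hypothetical strategy labels; the wrapper would map these to real tools.
RECOVERY = {
    ErrorCategory.MISSING_DEP: "install_package",
    ErrorCategory.SYNTAX_ERROR: "stop_and_report",
    ErrorCategory.PERMISSION_DENIED: "retry_with_sudo_or_abort",
    ErrorCategory.UNKNOWN: "stop_and_report",
}


def classify(exit_code: int, stderr: str) -> Optional[ErrorCategory]:
    """Return None on success, otherwise a discrete error category."""
    if exit_code == 0:
        return None
    for category, pattern in _PATTERNS:
        if pattern.search(stderr):
            return category
    # Shells conventionally use exit code 126 for "found but not executable".
    if exit_code == 126:
        return ErrorCategory.PERMISSION_DENIED
    return ErrorCategory.UNKNOWN
```

With this sketch, `classify(1, "ModuleNotFoundError: No module named 'requests'")` yields `ErrorCategory.MISSING_DEP`, and `RECOVERY` then selects the install strategy instead of leaving the agent to rerun the same command.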
Journey Context:
Raw error strings are noisy and variable, and without clear structure an LLM may confuse "command not found" with "file not found". Classifying errors into a discrete taxonomy \(similar to exception types and try/catch blocks in traditional programming\) lets the agent select from a predefined recovery policy rather than hallucinate new commands. This prevents the "retry storm" in which the agent wastes tokens on identical failed actions. The forced acknowledgment of the error category makes the model engage with the failure mode rather than ignore it or hallucinate a successful result.
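The forced acknowledgment and the retry-storm protection could be enforced in the wrapper roughly as follows. This is a sketch under stated assumptions: the `Recovery Action:` prefix, the `validate_recovery_action` helper, and the `RetryGuard` class are hypothetical names, not part of any existing agent framework, and the explicit guard is one possible way to back up the policy rather than something the report prescribes.

```python
from collections import Counter


def validate_recovery_action(thought: str, category: str) -> bool:
    """Accept the thought only if it is labelled and cites the error category."""
    return thought.strip().startswith("Recovery Action:") and category in thought


class RetryGuard:
    """Blocks re-running a command that already failed, until recovery happens."""

    def __init__(self, max_failures: int = 1):
        self.max_failures = max_failures
        self._failures = Counter()

    def record_failure(self, command: str) -> None:
        self._failures[command] += 1

    def allows(self, command: str) -> bool:
        # A command is allowed while its failure count is below the limit.
        return self._failures[command] < self.max_failures

    def clear(self) -> None:
        # Call after a recovery step succeeds (e.g., a package was installed),
        # so the original command may be attempted again.
        self._failures.clear()
```

A wrapper using this sketch would reject the next tool call unless `validate_recovery_action` passes and `RetryGuard.allows` returns true for the proposed command.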
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:43:33.450747+00:00— report_created — created