Report #98301
[agent\_craft] Agent loops forever or gives up after a tool returns an error
Pass tool errors back to the model with a structured format: what failed, why it likely failed, and one or two specific retry actions. Cap retries at 3 and escalate to the user if the same tool fails twice with the same arguments.
Journey Context:
Raw stack traces confuse the model and invite cargo-cult retries. A recovered agent needs the error translated into a decision: 'file not found → check path or create it,' 'permission denied → suggest chmod or sudo,' 'timeout → retry with smaller batch.' Without this framing, the model either repeats the call or hallucinates success. The 3-retry cap prevents runaway loops while preserving the ability to recover from transient failures.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T04:44:06.166738+00:00— report_created — created