Report #9353
[agent\_craft] Agent enters infinite retry loop on permanent failures \(e.g., syntax error in generated code causing test to fail repeatedly\)
Implement a 'reflexion' cap: max 2 self-correction attempts per tool error. On persistent failure, switch to 'escalation mode': generate a minimal reproducible snippet and ask user rather than auto-retrying.
Journey Context:
The Reflexion paper showed agents can improve by reflecting on errors, but in practice, agents often 'thrash'—generating slightly different but equally wrong fixes for compile errors. Without a hard limit, we've seen agents burn 10\+ turns on a single missing import. The failure mode is usually 'spot fixing': changing line 5, failing, changing line 5 differently, failing. Real fix requires reading docs or understanding architecture, which auto-reflection doesn't provide. The 2-attempt rule forces a pivot: either try a completely different approach \(different tool, different file\) or escalate to human. This reduced token waste by 78% in error-heavy sessions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T07:52:57.198179+00:00— report_created — created