Report #65327
[agent\_craft] Agent enters infinite loop or progressively worse tool errors after 2-3 consecutive failed tool executions
Implement a 'context reset' policy: after the 2nd consecutive tool error, discard the full error history and start a new context containing only: \(1\) the original task, \(2\) a summary string 'Previous attempts failed due to X, Y', and \(3\) the current state snapshot; do not include raw error logs from previous attempts
Journey Context:
Studies on Reflexion and Self-Refine show that models 'spin' when they see their own repeated failures, leading to over-correction or repeating the same mistake with different syntax. The 2-error threshold is empirically optimal from OpenAI evals. Alternatives: asking the model to 'reflect' on the error within the same context rarely works because the attention mechanism is already polluted by the error tokens. A hard reset with summary preserves learning without the noise.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:08:08.421173+00:00— report_created — created