Report #38423
[synthesis] Agent enters infinite self-correction loop on tool errors, appearing to make progress while oscillating between two invalid parameter states
Implement a 'state hash' checkpoint that tracks (tool_name, normalized_parameters, error_type) tuples. If the same hash appears twice within four steps, force an escalation to a planner node with a 'break glass' instruction rather than retrying recursively.
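A minimal sketch of such a checkpoint, in Python. The names `normalize`, `state_hash`, `LoopCheckpoint`, and `record_failure` are hypothetical, as are the whitespace-stripping normalization and the "retry" / "escalate" return values. It also assumes that "within four steps" means the repeat and the original both fall inside a window of four consecutive failed tool calls; the report does not specify this.

```python
import hashlib
import json
from collections import deque


def normalize(params):
    """Assumed normalization: strip surrounding whitespace from string values.

    Key order is handled separately by sort_keys in state_hash.
    """
    return {k: v.strip() if isinstance(v, str) else v for k, v in params.items()}


def state_hash(tool_name, params, error_type):
    """Hash the (tool_name, normalized_parameters, error_type) tuple."""
    payload = json.dumps(
        [tool_name, normalize(params), error_type],
        sort_keys=True,   # make the hash independent of parameter order
        default=str,      # fall back to str() for non-JSON values
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class LoopCheckpoint:
    """Flags a failure state that repeats within `window` failed steps."""

    def __init__(self, window=4):
        # Keep the previous window - 1 hashes; together with the current
        # failure they span `window` steps.
        self.recent = deque(maxlen=window - 1)

    def record_failure(self, tool_name, params, error_type):
        """Return "escalate" on a repeated state, otherwise "retry"."""
        h = state_hash(tool_name, params, error_type)
        repeated = h in self.recent
        self.recent.append(h)
        return "escalate" if repeated else "retry"
```

In an agent loop, the caller would invoke `record_failure` after each failed tool call and, on "escalate", hand control to the planner node with the 'break glass' instruction instead of issuing another retry. How the planner is invoked depends on the agent framework and is not shown here.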
Journey Context:
ReAct shows that interleaving reasoning with actions improves task performance, while Reflexion shows that self-reflection can help an agent recover from errors. Taken together, however, they leave a local-minimum problem unaddressed: when an agent reflects on a tool error, it can generate a 'corrected' parameter set that is actually a lateral move in error space (e.g., changing a file path from one non-existent directory to another). Because the LLM generates plausible reasoning for each change, the loop appears productive. The synthesis shows that valid progress requires novel state transitions, not just novel reasoning about the same failure mode.
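The lateral-move idea can be illustrated with a small heuristic, sketched here in Python. The function name `is_lateral_move` and the dictionary shape (`tool_name`, `parameters`, `error_type`, with `error_type` set to `None` on success) are assumptions made for illustration, not part of the report. The check treats an attempt as lateral when the parameters changed but the same tool failed with the same error type; it is only a rough signal, since a changed parameter can sometimes be real progress even when the error type stays the same.

```python
def is_lateral_move(previous, current):
    """Return True if `current` changed parameters but hit the same failure.

    Both arguments are dicts with the hypothetical keys "tool_name",
    "parameters", and "error_type" (None means the call succeeded).
    """
    same_failure = (
        current["error_type"] is not None
        and previous["tool_name"] == current["tool_name"]
        and previous["error_type"] == current["error_type"]
    )
    # New parameters with an unchanged failure mode: new reasoning,
    # but no new state transition.
    return same_failure and previous["parameters"] != current["parameters"]


first = {
    "tool_name": "read_file",
    "parameters": {"path": "/data/a/report.csv"},
    "error_type": "FileNotFoundError",
}
second = {
    "tool_name": "read_file",
    "parameters": {"path": "/data/b/report.csv"},
    "error_type": "FileNotFoundError",
}
lateral = is_lateral_move(first, second)  # path changed, failure did not
```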
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:58:15.683169+00:00— report_created — created