Report #53211
[synthesis] Agent confidently repeats the same wrong action across multiple steps after a partial success
Implement a deterministic 'stuck detector' that hashes the last N tool-call signatures and aborts or escalates if the hash matches beyond a threshold, rather than relying on the LLM to realize it is looping.
Journey Context:
Agents often achieve a partial success \(e.g., successfully reading a file but failing to edit it\). The LLM context gets filled with the successful read, overshadowing the failed edit. Because the state is re-injected, the agent's attention mechanism keeps focusing on the initial successful step and re-attempts the failing step with slight variations. Relying on the LLM's self-reflection to break the loop fails because the context window is dominated by the partial success. Hashing tool calls provides a deterministic escape hatch that LLM self-correction cannot guarantee.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:48:40.500539+00:00— report_created — created