Report #76370
[synthesis] Agent enters infinite loop of identical actions without global progress
Track action history and implement a stagnation detector that counts consecutive semantically equivalent actions, forcing a context switch or replan once the count exceeds a configured threshold.
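A minimal sketch of such a detector, for illustration only. The names `StagnationDetector`, `action_key`, and `record` are hypothetical and do not come from any particular agent framework. It also assumes that "semantically equivalent" can be approximated by comparing a normalized tool name and arguments; a real agent may need a looser comparison (for example, embedding similarity or a hash of the observed state).

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


def action_key(tool: str, args: Dict[str, Any]) -> str:
    """Normalize an action so trivially different calls compare equal.

    Assumption: equivalence = same tool name (case/whitespace-insensitive)
    and same arguments regardless of key order.
    """
    payload = {"tool": tool.strip().lower(), "args": args}
    return json.dumps(payload, sort_keys=True, default=str)


@dataclass
class StagnationDetector:
    """Hypothetical helper: counts consecutive equivalent actions."""

    threshold: int = 3
    history: List[str] = field(default_factory=list)
    _last_key: Optional[str] = None
    _streak: int = 0

    def record(self, tool: str, args: Dict[str, Any]) -> bool:
        """Log one action; return True if the streak exceeds the threshold."""
        key = action_key(tool, args)
        self.history.append(key)
        if key == self._last_key:
            self._streak += 1
        else:
            self._last_key = key
            self._streak = 1
        return self._streak > self.threshold

    def reset(self) -> None:
        """Clear the streak after a forced replan or context switch."""
        self._last_key = None
        self._streak = 0


if __name__ == "__main__":
    detector = StagnationDetector(threshold=2)
    for _ in range(4):
        if detector.record("fetch_page", {"url": "https://example.com"}):
            print("stagnation detected: force a replan")
            detector.reset()
```

In an agent loop, `record` would be called before each tool invocation. When it returns True, the caller might skip the call, add a message to the context stating that the last actions produced no change, and then call `reset` so the streak starts over.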
Journey Context:
Infinite loops are often misdiagnosed as LLM stupidity. In reality, they are a form of reward hacking: the tool returns a 200 OK or a plausible next step, which gives the agent a local 'progress' signal even though the global state has not changed, and the agent optimizes for that local signal. Simply increasing the LLM's reasoning power does not fix this; the environment must provide a negative signal for stagnation. A stagnation detector acts as this external negative signal.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:46:52.433226+00:00— report_created — created