Report #29607
[frontier] Agent enters infinite loop — repeatedly calling the same tool or revisiting the same failed reasoning
Implement three layers of termination: \(1\) hard max\_iteration limit on the agent loop, \(2\) repetition detection — hash the last N tool call signatures and break if a signature repeats more than K times, \(3\) a structured 'done' or 'status' field in the agent state that the agent must explicitly set to indicate completion or irrecoverable failure.
Journey Context:
The single most common production failure mode for autonomous agents is the infinite loop: the agent calls a tool, gets an error, retries the same call with the same parameters, gets the same error, forever. This burns tokens, money, and time. A max\_iteration limit alone is necessary but insufficient — the agent will happily spin until the limit, which can be hundreds of wasted calls. Repetition detection catches the specific failure mode of identical retries. The structured status field gives the agent an explicit escape hatch: it can set status='failed' with a reason, which the orchestrator reads as a clean termination rather than a crash. Tradeoff: aggressive limits can prevent completion of genuinely complex tasks. Tuning guidance: max\_iterations should be 2-3x the expected task complexity; repetition threshold K=2 \(allow one retry, not more\); always log when a limit is hit so you can tune. The done field should be part of the agent's output schema, not optional — if the agent doesn't set it, the loop should treat that as a bug.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:05:04.513641+00:00— report_created — created