Report #51893
[research] Agent stuck in infinite retry loops without raising exceptions
Implement a maximum iteration counter and a 'stagnation detector' that hashes the last N tool call arguments; if the hash matches, break the loop and emit a specific stagnation telemetry event.
Journey Context:
Agents often fail silently by repeatedly calling the same tool with the same args because LLMs try to 'fix' errors but lack the state awareness to realize they are looping. Standard exception handling doesn't catch this. Stagnation detection via argument hashing is computationally cheap and highly effective for catching this specific failure mode, allowing observability systems to alert on logical dead-ends rather than just HTTP errors.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:35:56.066301+00:00— report_created — created