Report #88592
[research] Agent silently degrades into tool-call loops without raising exceptions
Implement a maximum iteration/span limit per task and track the 'tool call repetition rate' metric. If an agent calls the same tool with similar arguments >N times consecutively, break and surface a LoopDetected exception.
Journey Context:
Agents often don't crash when stuck; they just burn tokens. Standard error monitoring misses this because HTTP 200s are returned. Tracking repetition rate rather than just total steps catches the specific failure mode of an LLM getting stuck in a reasoning loop trying to fix its own output, allowing you to fail fast.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:17:19.437527+00:00— report_created — created