Report #65379
[research] Agent gets stuck in an infinite loop calling the same tool with identical or slightly varied arguments
Implement a span-level telemetry check for consecutive identical tool calls. Set a hard limit on retries (e.g., at most 3 calls to the same tool per run) and, once the limit is exceeded, emit a dedicated loop_detected span event to trigger alerts and break the execution.
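A minimal sketch of the retry limit described above, assuming the tracing span exposes an add_event(name, attributes) method (the shape used by OpenTelemetry's Python Span). RecordingSpan is a stand-in so the example runs without a tracing backend; ToolCallLimiter, LoopDetected, and the attribute keys are hypothetical names, not part of any library.

```python
from collections import Counter


class LoopDetected(RuntimeError):
    """Raised to break the run once a tool exceeds its call budget."""


class RecordingSpan:
    """Stand-in for a tracing span; only add_event() is needed here."""

    def __init__(self):
        self.events = []

    def add_event(self, name, attributes=None):
        self.events.append((name, dict(attributes or {})))


class ToolCallLimiter:
    """Counts calls per tool within one run and halts past max_calls."""

    def __init__(self, span, max_calls=3):
        self.span = span
        self.max_calls = max_calls
        self.counts = Counter()

    def check(self, tool_name):
        """Call before each tool invocation; raises once the budget is exceeded."""
        self.counts[tool_name] += 1
        count = self.counts[tool_name]
        if count > self.max_calls:
            # Emit the event first so alerting sees it even if the caller
            # swallows the exception.
            self.span.add_event(
                "loop_detected",
                {"tool.name": tool_name, "tool.call_count": count},
            )
            raise LoopDetected(f"{tool_name} called {count} times in one run")
```

The agent loop would call limiter.check(tool_name) before dispatching each tool call and treat LoopDetected as a terminal condition for the run. Counting per tool name is the coarse variant from the example limit; a stricter check would key on the arguments as well.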
Journey Context:
LLMs often get stuck in repetitive loops when a tool fails or returns an unexpected format, retrying the same call in the hope of a different result. Standard timeout limits don't catch this when each tool call executes quickly. Observability must explicitly track a hash of (tool_name, tool_args) over a sliding window to detect and halt this specific failure mode.
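The sliding-window check can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: call_fingerprint, SlidingWindowLoopDetector, window_size, and max_repeats are hypothetical names and defaults, and tool_args is assumed to be JSON-serializable (anything else falls back to str()).

```python
import hashlib
import json
from collections import deque


def call_fingerprint(tool_name, tool_args):
    """Hash (tool_name, tool_args); sort_keys makes dict key order irrelevant."""
    payload = json.dumps([tool_name, tool_args], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SlidingWindowLoopDetector:
    """Keeps fingerprints of the last window_size calls and flags repeats."""

    def __init__(self, window_size=10, max_repeats=3):
        self.window = deque(maxlen=window_size)  # oldest entries drop off
        self.max_repeats = max_repeats

    def record(self, tool_name, tool_args):
        """Record a call; return True if its fingerprint now occurs
        at least max_repeats times within the window."""
        fingerprint = call_fingerprint(tool_name, tool_args)
        self.window.append(fingerprint)
        return self.window.count(fingerprint) >= self.max_repeats


detector = SlidingWindowLoopDetector(window_size=5, max_repeats=3)
for attempt in range(3):
    looping = detector.record("search", {"query": "status", "limit": 1})
print(looping)  # True: the third identical call trips the detector
```

Exact hashing only catches identical arguments. Catching slightly varied arguments would require normalizing them before hashing (for example trimming whitespace or dropping volatile fields), and which fields are safe to ignore depends on the tool.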
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:13:11.905721+00:00— report_created — created