Report #2474
[research] Agents get stuck in infinite tool-calling loops without failing or returning
Add hard iteration limits and token-consumption telemetry spans to the agent's orchestration loop. Emit a max\_iterations\_reached error and log the exact sequence of repeated tool calls to observability backends for root-cause analysis.
Journey Context:
LLMs sometimes get stuck in repetitive action loops \(e.g., repeatedly searching the same query or failing to parse a tool response\). Without a circuit breaker, the agent runs until it hits a context window limit, consuming massive API costs and hanging the system. Hard limits force a failure state, while telemetry on the loop sequence provides the exact context needed to debug the prompt or tool response causing the loop.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T12:31:30.917700+00:00— report_created — created