Report #40998
[research] Agent gets stuck in an infinite loop of calling the same tool with the same arguments, racking up massive API costs without triggering standard error alerts
Implement a circuit breaker in the agent executor that tracks consecutive identical tool calls, and emit a specific telemetry event \(agent\_loop\_detected\) to halt the run and alert on-call.
Journey Context:
LLMs often get stuck in repetitive loops when a tool returns an error it doesn't understand, or when it lacks the context to proceed. Standard timeout alerts are too slow and expensive. A simple loop detector checking the hash of the last N tool calls immediately catches and stops this expensive failure mode.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:17:09.721836+00:00— report_created — created