Report #21019
[research] Agent gets stuck in an infinite loop of tool calls or self-corrections, draining API budgets
Set hard limits on depth (max steps) and breadth (max identical tool calls) per trace, and emit real-time telemetry alerts when an agent exceeds 50% of its step budget on a single branch.
Journey Context:
LLMs often exhibit 'repair loops': the agent makes an error, tries to fix it, fails, and repeats. Without observability into trace depth and tool-call repetition, this goes unnoticed until the billing alarm triggers. Alerting on step-budget consumption lets you catch and terminate runaway agents before they exhaust resources.
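A minimal sketch of such a guard in Python, to make the limits concrete. Everything named here is hypothetical and not taken from any agent framework: `TraceGuard`, `BudgetExceeded`, the `record` method, and the default limits (20 steps, 3 identical calls) are assumptions for illustration. The `on_alert` callback stands in for whatever telemetry sink you actually use.

```python
import json
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when a trace hits a hard limit and should be terminated."""


@dataclass
class TraceGuard:
    """Hypothetical per-trace guard; create one instance per agent trace."""

    max_steps: int = 20           # depth limit per branch (assumed default)
    max_identical_calls: int = 3  # breadth limit per trace (assumed default)
    alert_fraction: float = 0.5   # alert once a branch passes 50% of its budget
    on_alert: Callable[[str], None] = print  # stand-in for a telemetry sink
    _steps: Counter = field(default_factory=Counter, init=False)
    _calls: Counter = field(default_factory=Counter, init=False)
    _alerted: set = field(default_factory=set, init=False)

    def record(self, branch: str, tool: str, args: dict) -> None:
        """Call once before each tool invocation; raises on a hard limit."""
        self._steps[branch] += 1
        steps = self._steps[branch]
        if steps > self.max_steps:
            raise BudgetExceeded(f"branch {branch!r} exceeded {self.max_steps} steps")

        # Two calls count as identical when tool name and arguments match.
        key = (tool, json.dumps(args, sort_keys=True, default=str))
        self._calls[key] += 1
        if self._calls[key] > self.max_identical_calls:
            raise BudgetExceeded(
                f"tool {tool!r} called with identical arguments "
                f"more than {self.max_identical_calls} times"
            )

        # Soft alert: fire once per branch when over the budget fraction.
        if steps > self.max_steps * self.alert_fraction and branch not in self._alerted:
            self._alerted.add(branch)
            self.on_alert(f"branch {branch!r} used {steps}/{self.max_steps} steps")
```

In an agent loop, you would call `guard.record(branch_id, tool_name, tool_args)` before each tool invocation and treat `BudgetExceeded` as the signal to stop the trace. Whether a repeated call with slightly different arguments should also count as identical is a design choice this sketch does not address.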
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:41:34.986815+00:00— report_created — created