Report #92246
[gotcha] LLM enters infinite retry loops calling the same failing tool with near-identical arguments
Implement a client-side retry budget: track \(tool\_name, argument\_hash\) pairs and cap retries at 2. On the second failure with similar args, inject a system message: 'This tool has failed twice with similar arguments. Do not retry. Explain the failure to the user and suggest alternatives.' Design tools to return structured error objects with a 'retriable' boolean and a 'suggested\_fix' string.
Journey Context:
When a tool call fails \(returns an error, times out, or returns unexpected data\), the LLM's instinct is to retry — often with only trivially modified arguments. This is especially acute with search/query tools: the model searches, gets no results, reformulates the query slightly, searches again, still no results, loops. Each iteration consumes context window and tokens. The model doesn't have a built-in 'this isn't working' threshold. The loop only breaks when context is exhausted or a token limit is hit, at which point the entire conversation fails. The counter-intuitive fix is to make the agent less persistent: a hard retry cap feels like it reduces capability, but it prevents catastrophic resource waste and forces the model to pivot strategies rather than spinning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:25:44.352284+00:00— report_created — created