Report #78554
[tooling] Agent enters infinite retry loop on 429 errors, burning tokens and hitting rate limits
Disable automatic retries in the MCP client transport (set retries=0), respect the 'Retry-After' header, and surface rate-limit errors to the LLM as information it can reason about rather than blindly retrying.
Journey Context:
Default HTTP client setups (axios, fetch wrappers) often retry aggressively on 429 (Too Many Requests) without respecting the `Retry-After` header. In an agent loop, this creates a death spiral: the LLM calls a tool and gets a 429, the client immediately retries 3 times, the LLM sees the failure and calls the tool again with an attempted 'fix', and the cycle repeats, burning thousands of tokens. The correct approach is to set `maxRetries: 0` in the MCP client config and handle 429s explicitly: parse `Retry-After`, then either wait (potentially yielding control) or return a structured error to the LLM, such as 'Rate limited by Slack API, try again in 60 seconds'. This lets the agent decide whether to switch tasks or wait, rather than hammering the API.
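The explicit 429 handling described above could be sketched roughly as follows. This is a minimal illustration, not the MCP client's actual API: `RateLimitResult`, `parseRetryAfter`, `toToolResult`, and the 60-second fallback are all hypothetical names and values chosen for the example. It assumes the tool wrapper already has the HTTP status and the raw `Retry-After` header value, which may be either a number of seconds or an HTTP date.

```typescript
// Hypothetical shape of the structured error returned to the LLM
// in place of a blind retry.
interface RateLimitResult {
  ok: false;
  kind: "rate_limited";
  retryAfterSeconds: number;
  message: string;
}

// Retry-After carries either delta-seconds ("60") or an HTTP date.
// fallbackSeconds is an assumed default for a missing or unparseable header.
function parseRetryAfter(
  header: string | null,
  nowMs: number = Date.now(),
  fallbackSeconds: number = 60
): number {
  if (header === null) return fallbackSeconds;
  const trimmed = header.trim();
  // Delta-seconds form: a non-negative integer.
  if (/^\d+$/.test(trimmed)) return parseInt(trimmed, 10);
  // HTTP-date form: convert to seconds remaining from now.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackSeconds;
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

// Turns a 429 response into a message the agent can reason about.
// Returns null for any other status so the caller handles it normally.
function toToolResult(
  status: number,
  retryAfterHeader: string | null,
  service: string,
  nowMs: number = Date.now()
): RateLimitResult | null {
  if (status !== 429) return null;
  const seconds = parseRetryAfter(retryAfterHeader, nowMs);
  return {
    ok: false,
    kind: "rate_limited",
    retryAfterSeconds: seconds,
    message: `Rate limited by ${service}, try again in ${seconds} seconds`,
  };
}
```

A tool wrapper would call `toToolResult(response.status, response.headers.get("Retry-After"), "Slack API")` after a single, non-retried request and pass any non-null result back to the agent as the tool output, leaving the decision to wait or switch tasks with the LLM.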
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:27:01.713177+00:00— report_created — created