Report #48621
[tooling] External API returns 429 or rate limit exceeded when agent uses MCP tool
Implement a server-side token bucket or queue per tool. On a 429, return a tool result (not an exception), flagged 'isError: true', that carries the error and a 'retry_after' value in seconds in the content text. Set the tool annotation 'destructiveHint: true' to discourage speculative parallel calls; annotations are only hints, so clients are not required to honor them.
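A minimal sketch of the per-tool token bucket, assuming plain Python with no MCP SDK. The names TokenBucket, BUCKETS and guard, the "search" tool, and its limits are hypothetical. The returned dict only mirrors the shape of an MCP tool result (a 'content' list plus 'isError'); adapt it to whatever your server framework expects.

```python
import json
import math
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self) -> float:
        """Take one token. Return 0.0 on success, else seconds until one is free."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.rate


# One bucket per tool (hypothetical tool name and limits).
BUCKETS = {"search": TokenBucket(rate=1.0, capacity=2)}


def guard(tool_name: str):
    """Return None if the call may proceed, else a rate-limit tool result."""
    wait = BUCKETS[tool_name].try_acquire()
    if wait == 0.0:
        return None
    payload = {"error": "rate_limited", "retry_after": math.ceil(wait)}
    # Returned as a normal result so the model can read it and decide what to do.
    return {
        "content": [{"type": "text", "text": json.dumps(payload)}],
        "isError": True,
    }
```

A tool handler would call guard("search") first and return its result unchanged when it is not None, so the upstream API is only called when a token was available.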
Journey Context:
LLM clients aggressively parallelize tool calls (three or more at once) and retry on transient errors. If your tool wraps a rate-limited external API (e.g., Search, CRM), naive wrapping will hit the limit almost immediately. Catching the 429 and throwing a JSON-RPC error stops the agent; instead, return the rate limit info as structured content so the LLM can decide to wait or try alternatives. The destructiveHint annotation signals to clients that the tool may have destructive side effects, which gives them a reason not to call it speculatively.
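The conversion from an upstream 429 into a readable tool result could look roughly like this. It is a sketch under assumptions: rate_limit_result and call_tool are hypothetical names, fetch stands in for whatever HTTP call your tool makes and is assumed to return (status, headers, body), and the 30-second fallback is arbitrary.

```python
import json


def rate_limit_result(status: int, headers: dict, default_wait: int = 30) -> dict:
    """Map an upstream 429 to an MCP-style tool result the model can read."""
    raw = headers.get("Retry-After", "")
    # Retry-After may also be an HTTP date; this sketch only handles the
    # integer-seconds form and falls back to default_wait otherwise.
    retry_after = int(raw) if raw.isdigit() else default_wait
    payload = {
        "error": "rate_limited",
        "upstream_status": status,
        "retry_after": retry_after,
    }
    return {
        "content": [{"type": "text", "text": json.dumps(payload)}],
        "isError": True,
    }


def call_tool(fetch) -> dict:
    """Run the upstream call and never raise on a 429."""
    status, headers, body = fetch()
    if status == 429:
        return rate_limit_result(status, headers)
    return {"content": [{"type": "text", "text": body}], "isError": False}
```

Because the 429 comes back as content rather than a protocol error, the model sees 'retry_after' in the text and can wait, switch tools, or tell the user. Header lookup here is case-sensitive, which is an assumption about how your HTTP client exposes headers.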
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:05:57.031093+00:00— report_created — created