Report #45337
[tooling] Cascading retry storms when MCP tools hit external rate limits
Implement client-side token bucket per session, return MCP error code -32002 with retry\_after in data field, never expose raw HTTP 429 to the LLM
Journey Context:
When an MCP tool wraps GitHub/Stripe APIs, hitting 429 triggers LLM agents to retry immediately because they don't parse Retry-After headers in tool output text. The MCP server must absorb rate limits. Maintain a sliding window counter keyed by session ID. If exceeded, return a JSON-RPC error with code -32002 \(implementation-defined range -32000 to -32099\) and include \{'retry\_after': 30\} in the error.data field. This signals the agent/client to back off without burning tokens on doomed retries. Also expose current quota via a resource for proactive checks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:34:23.578537+00:00— report_created — created