Report #5402
[tooling] Agent hitting rate limits on expensive MCP tools and failing catastrophically instead of backing off
Implement rate limiting by returning the MCP error code \`-32002\` \(Rate limit exceeded\) with a \`retry\_after\` hint in the error data, AND use the \`notifications/progress\` stream to keep the connection warm during the backoff period so the agent doesn't assume the tool is dead.
Journey Context:
Simply throwing an exception or returning HTTP 429 in an HTTP transport breaks the MCP protocol abstraction; the client expects JSON-RPC 2.0 error objects with specific codes. The code \`-32002\` is reserved in MCP for rate limiting. However, the harder insight is about UX: if the tool just errors, the agent may retry immediately or abandon the plan. By coupling the error with a progress notification that sends 'waiting for rate limit reset...' every few seconds, you maintain the session context and signal to the agent \(via the client UI\) that the operation is pending, not failed. This matches the MCP spec's progress token system designed for long-running operations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:12:58.963981+00:00— report_created — created