Report #31229
[tooling] MCP tool hits external API rate limits unfairly across multiple client sessions
Implement a token bucket per session, keyed by the session ID established when the client initializes; when a bucket is empty, return JSON-RPC error code -32002 (RateLimitExceeded) with retry_after in the error's data object
Journey Context:
MCP servers are typically long-lived processes handling multiple sequential or concurrent client connections (sessions). When a tool wraps an external API with strict rate limits (e.g., 100 requests/hour), naive global rate limiting is unfair: if User A uses 90 requests, User B is left with only 10. Per-IP limiting fails in the other direction, because multiple users behind the same NAT share one address and therefore one quota. The solution is to key the limiter on a session identifier established at initialization and maintain an isolated token bucket per session in server state. Prefer a per-connection sessionId where the transport provides one; clientInfo.name from the initialize request names the client software rather than an individual session, so two users running the same client would share a bucket if it were used alone. When the limit is hit, return error code -32002 (RateLimitExceeded) with a retry_after field in the error data object. This code lies in the range JSON-RPC reserves for implementation-defined server errors (-32000 to -32099), so its meaning should be spelled out in the error message as well. An explicit retry_after lets the LLM back off instead of retrying immediately, which prevents error loops that waste LLM tokens.
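The approach described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: TokenBucket, SessionRateLimiter, and rate_limit_error are hypothetical names, the session ID is assumed to be available to the tool handler as a string, and the 100 requests/hour default simply mirrors the example limit in this report.

```python
import math
import time
from dataclasses import dataclass

# Code used in this report; it is implementation-defined, not assigned by JSON-RPC.
RATE_LIMIT_EXCEEDED = -32002


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    updated: float

    def try_consume(self, now: float) -> float:
        """Take one token and return 0.0, or return seconds until one is available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_sec


class SessionRateLimiter:
    """Keeps one isolated bucket per session ID (hypothetical helper)."""

    def __init__(self, capacity=100, period_sec=3600.0, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_sec = capacity / period_sec
        self.clock = clock
        self.buckets = {}

    def check(self, session_id: str) -> float:
        """Return 0.0 if the call may proceed, else the suggested wait in seconds."""
        now = self.clock()
        bucket = self.buckets.get(session_id)
        if bucket is None:
            # New sessions start with a full bucket.
            bucket = TokenBucket(self.capacity, self.refill_per_sec, self.capacity, now)
            self.buckets[session_id] = bucket
        return bucket.try_consume(now)


def rate_limit_error(request_id, retry_after: float) -> dict:
    """Build a JSON-RPC 2.0 error response carrying retry_after (in seconds)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": RATE_LIMIT_EXCEEDED,
            "message": "RateLimitExceeded: retry after the indicated number of seconds",
            "data": {"retry_after": math.ceil(retry_after)},
        },
    }
```

A tool handler would call limiter.check(session_id) before touching the external API and return rate_limit_error(request_id, wait) whenever the result is greater than zero. Note that isolated buckets divide access fairly but do not by themselves cap the combined traffic, so the per-session capacity likely needs to be chosen so that the expected number of sessions stays within the upstream quota. A long-lived server would also want to evict buckets for sessions that have ended.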
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:48:21.777778+00:00— report_created — created