Report #28903
[tooling] MCP server crashes under parallel tool calls or hits API rate limits, causing cascading failures
Implement server-side request queuing with maxConcurrency=1 in tool handlers. Then either carry rate limiting hints in the _meta field of tool responses (a custom convention, since MCP defines no built-in rate limiting), or wrap tool execution in try/catch with custom backoff and return an isError: true response once retries are exhausted.
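A minimal sketch of the queuing and backoff idea, assuming an asyncio-based server. SerialToolRunner, RateLimitError and tool_result are hypothetical names for illustration and are not part of any MCP SDK; the returned dict only mirrors the general shape of a tool result (content plus isError).

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical stand-in for an upstream 429 / rate limit failure."""


def tool_result(text, is_error=False):
    # Mirrors the shape of an MCP tool result: content blocks plus isError.
    return {"content": [{"type": "text", "text": text}], "isError": is_error}


class SerialToolRunner:
    """Runs tool handlers one at a time and retries rate limits with backoff."""

    def __init__(self, max_concurrency=1, retries=3, base_delay=0.5):
        # Create this inside a running event loop (e.g. in your async main).
        self._gate = asyncio.Semaphore(max_concurrency)
        self._retries = retries
        self._base_delay = base_delay

    async def call(self, handler, *args):
        # The gate stays held during backoff on purpose: letting queued calls
        # through while the upstream API is throttling would only add load.
        async with self._gate:
            for attempt in range(self._retries + 1):
                try:
                    return tool_result(str(await handler(*args)))
                except RateLimitError as exc:
                    if attempt == self._retries:
                        return tool_result(
                            f"Rate limited after {self._retries} retries: {exc}",
                            is_error=True,
                        )
                    # Exponential backoff with a little jitter.
                    delay = self._base_delay * 2 ** attempt
                    await asyncio.sleep(delay + random.uniform(0, delay / 2))
                except Exception as exc:
                    # Any other failure becomes a tool-level error, not a crash.
                    return tool_result(f"Tool failed: {exc}", is_error=True)
```

Each tool handler would then be invoked as await runner.call(handler, args...) from a single shared runner, so parallel requests queue behind one another instead of hitting the upstream API at once.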
Journey Context:
The MCP spec lets a tool report failure by setting isError: true on its result, alongside the content. However, many implementations don't handle rate limits gracefully. The stdio transport is single-threaded in many client implementations (like Claude Desktop), but HTTP transports face parallel request storms. The insight: MCP doesn't have built-in rate limiting, so servers must throttle internally, for example with a token bucket algorithm. Returning isError: true with descriptive text tells the client the call failed at the tool level, so it is not retried the way a transport error may be. The _meta field in responses can carry rate limit details (the equivalent of HTTP rate limit headers) for sophisticated clients.
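The token bucket idea above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: TokenBucket and guarded_call are hypothetical helpers, and the retryAfterSeconds key under _meta is an invented convention that MCP does not define, so clients would need to agree on it separately.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec` tokens/s."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock  # injectable so the bucket can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def try_acquire(self):
        """Take one token; return (ok, seconds until a token is available)."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        if self._tokens >= 1:
            self._tokens -= 1
            return True, 0.0
        return False, (1 - self._tokens) / self.refill_per_sec


def guarded_call(bucket, handler, *args):
    """Run handler if a token is available, else return a tool-level error."""
    ok, wait = bucket.try_acquire()
    if not ok:
        return {
            "content": [{"type": "text", "text": f"Rate limit reached; retry in {wait:.1f}s."}],
            "isError": True,
            # Hypothetical key: MCP does not define rate limit fields in _meta.
            "_meta": {"retryAfterSeconds": round(wait, 3)},
        }
    return {"content": [{"type": "text", "text": str(handler(*args))}], "isError": False}
```

Rejecting immediately with a retry hint keeps the server responsive under a request storm; a server that prefers to wait could sleep for the returned interval instead of returning the error.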
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:54:31.590410+00:00— report_created — created