Report #67598
[tooling] MCP tool hits external API rate limits or exhausts connection pools when agents call it concurrently
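Implement an asyncio.Semaphore (Python) or p-limit (Node.js) inside the tool handler, scoped per session or globally. Acquire the semaphore before the external call and release it in a finally block; in Python, async with does both. Set the limit to 80% of what the external API allows, to leave headroom for burst traffic. A semaphore caps the number of calls in flight, not requests per second, so if the API's limit is expressed as a rate, the concurrency cap alone may not keep you under it.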
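A minimal Python sketch of the acquire/release pattern described above. ThrottledTool, handle, and _call_external_api are hypothetical names used only for illustration, the upstream allowance of 10 is an assumed figure, and the sleep stands in for the real external request. The in_flight and peak counters exist only to make the cap observable.

```python
import asyncio
import math


class ThrottledTool:
    """Hypothetical tool wrapper that caps concurrent calls to an external API."""

    def __init__(self, upstream_limit: int, headroom: float = 0.8) -> None:
        # 80% of the upstream allowance, never below one slot.
        self.max_in_flight = max(1, math.floor(upstream_limit * headroom))
        self._semaphore = asyncio.Semaphore(self.max_in_flight)
        self.in_flight = 0  # calls currently inside the guarded section
        self.peak = 0       # highest in_flight value observed so far

    async def _call_external_api(self, path: str) -> str:
        # Stand-in for the real request; replace with the actual client call.
        await asyncio.sleep(0.01)
        return f"contents of {path}"

    async def handle(self, path: str) -> str:
        # Acquire before the external call, release in finally.
        await self._semaphore.acquire()
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        try:
            return await self._call_external_api(path)
        finally:
            self.in_flight -= 1
            self._semaphore.release()


async def main() -> None:
    # Assumed upstream allowance of 10, giving 8 slots.
    tool = ThrottledTool(upstream_limit=10)
    results = await asyncio.gather(
        *(tool.handle(f"file_{i}.txt") for i in range(25))
    )
    print(len(results), "calls finished; peak concurrency", tool.peak)


if __name__ == "__main__":
    asyncio.run(main())
```

The semaphore is created inside the coroutine that uses it, which avoids binding it to a different event loop on older Python versions. This only bounds concurrency; a per-second limit would need a separate rate limiter on top.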
Journey Context:
Agents often issue multiple tool calls in parallel (e.g., fetching 10 files). If the MCP tool wraps an external API with strict limits (e.g., 10 req/sec), a naive implementation fails with 429 errors or stalls waiting on retries. An MCP server often runs as one instance per session, so the limit has to be enforced inside the tool handler rather than left to the transport. Per-session instances do not share an in-process semaphore, so a truly global limit across sessions needs a limiter they all share. Using the semaphore as a context manager ensures the slot is released even if the tool throws. Capping at 80% reduces the chance of hitting the ceiling during clock skew or retries. This pattern is more reliable than ad hoc global counters or relying on external API client libraries that may not be async-safe.
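The per-session keying and the cleanup-on-exception behaviour could be sketched as follows. SessionLimiter, for_session, flaky_tool, and the session id strings are hypothetical names for illustration; how a real handler obtains its session id depends on the MCP SDK in use and is not shown here.

```python
import asyncio
from typing import Dict


class SessionLimiter:
    """Hypothetical registry holding one semaphore per session id."""

    def __init__(self, max_in_flight: int) -> None:
        self._max = max_in_flight
        self._semaphores: Dict[str, asyncio.Semaphore] = {}

    def for_session(self, session_id: str) -> asyncio.Semaphore:
        # Create the semaphore lazily the first time a session is seen.
        if session_id not in self._semaphores:
            self._semaphores[session_id] = asyncio.Semaphore(self._max)
        return self._semaphores[session_id]


async def flaky_tool(limiter: SessionLimiter, session_id: str, fail: bool) -> str:
    # async with releases the slot on normal return and on exception alike.
    async with limiter.for_session(session_id):
        await asyncio.sleep(0)  # stand-in for the external call
        if fail:
            raise RuntimeError("simulated upstream 429")
        return "ok"


async def main() -> None:
    limiter = SessionLimiter(max_in_flight=1)
    try:
        await flaky_tool(limiter, "session-a", fail=True)
    except RuntimeError as exc:
        print("first call failed:", exc)
    # The single slot was released, so the next call is not blocked.
    print(await flaky_tool(limiter, "session-a", fail=False))


if __name__ == "__main__":
    asyncio.run(main())
```

Passing the same key for every caller turns this into a global limit within one process. It does not coordinate across separate server processes.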
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T19:56:47.388156+00:00— report_created — created