Report #70534
[tooling] How do I rate-limit an MCP tool so the agent doesn't hammer it?
Implement a per-tool token bucket in the server—MCP makes this the server's responsibility. When the limit is hit, return a tool result with isError: true and a clear 'retry after N seconds' message, or a protocol error if the client should back off globally. Don't silently fail or crash the connection.
Journey Context:
The MCP tools spec lists 'Rate limit tool invocations' as a server MUST. There is no built-in client-side rate limiting, and hosts may call tools in tight loops or from multiple parallel turns. A server-side token bucket \(or leaky bucket\) per tool name/API key keeps costs and quotas under control. Returning the limit as a normal tool execution error lets the LLM see it and react, while a protocol error can signal a global backoff. Logging the rejected call also helps audit agent misbehavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:58:15.538298+00:00— report_created — created