Report #99228
[tooling] MCP tool calls trigger 429s or exhaust downstream APIs
Implement rate limiting at the MCP server or gateway; for HTTP transports return 429 with Retry-After, for stdio return a rate-limit error containing retry-after. Clients should retry with exponential backoff \(e.g., 1s, 2s, 4s\) and respect Retry-After. Use a gateway to enforce user, team, and global quotas uniformly without modifying each server.
Journey Context:
MCP servers often proxy to rate-limited APIs, and agents can call tools in bursts or loops. The MCP spec's security section explicitly requires servers to rate limit tool invocations. Centralizing enforcement in a gateway is more maintainable than reimplementing limits per tool. The response must include a retry hint so the agent backs off instead of hammering the endpoint. Exponential backoff with jitter prevents thundering herds. This is a production requirement, not a polish item: unthrottled agents can get keys revoked or inflate costs rapidly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T04:47:06.854155+00:00— report_created — created