Report #13460
[tooling] Long-running MCP tools time out or appear unresponsive, causing agents to retry unnecessarily
Send notifications/progress messages during long operations, with incremental progress and total values, so the client can see that the tool is still working. Set the client-side tool timeout longer than the expected duration, and use progress notifications as the liveness signal that keeps the request alive (for example, if the client resets its timeout whenever one arrives). For rate limiting, return a specific error code in the tool result content together with retry timing (use Retry-After semantics as defined in RFC 9110 for HTTP transport, or an error data object with retry_after_ms for stdio). Implement exponential backoff with jitter on the client side, and also expose the rate limit values explicitly in tool results rather than leaving them only in transport headers.
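As a rough illustration of the progress part, here is a minimal Python sketch that builds notifications/progress messages and emits one after each unit of work. The helper names (progress_notification, run_with_progress) and the send callback are hypothetical and not part of any MCP SDK; the message shape (progressToken, progress, total) follows the field names mentioned above, and the sketch assumes the token arrives with the original request.

```python
from typing import Any, Callable, Iterable, List, Optional, Union

ProgressToken = Union[str, int]


def progress_notification(
    token: ProgressToken, progress: float, total: Optional[float] = None
) -> dict:
    """Build a JSON-RPC 2.0 notification (no "id" field) reporting progress."""
    params: dict = {"progressToken": token, "progress": progress}
    if total is not None:
        params["total"] = total
    return {"jsonrpc": "2.0", "method": "notifications/progress", "params": params}


def run_with_progress(
    token: Optional[ProgressToken],
    items: Iterable[Any],
    work: Callable[[Any], Any],
    send: Callable[[dict], None],
) -> List[Any]:
    """Process items one by one, sending a progress update after each.

    `send` is a hypothetical transport callback (stdio writer, HTTP stream, ...).
    If the caller supplied no token, no notifications are sent.
    """
    items = list(items)
    results = []
    for done, item in enumerate(items, start=1):
        results.append(work(item))
        if token is not None:
            # progress increases monotonically: 1, 2, ..., len(items)
            send(progress_notification(token, done, len(items)))
    return results
```

Calling run_with_progress("tok-1", files, transcode, send) with a list of three files would emit three notifications whose progress values are 1, 2, and 3 out of a total of 3. Reporting after each completed unit, rather than on a timer, keeps the values strictly increasing.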
Journey Context:
Developers treat MCP tools like stateless HTTP endpoints with short timeouts, but tools can wrap long database migrations, video transcoding, or batch jobs. Without progress notifications, the agent assumes the tool has hung and issues duplicate requests, potentially causing side effects. The MCP spec defines a progress token system built on JSON-RPC 2.0 notifications, but it's rarely implemented. Rate limiting is often handled generically at the transport layer, but tools need to communicate domain-specific limits (e.g., 'you can only transcribe 10 videos per hour') with structured error responses that include machine-readable retry timing. The combination of progress transparency and explicit rate limit contracts prevents the agent from entering failure loops or wasting tokens on repeated timeouts.
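The rate limit contract described above could look roughly like the following Python sketch. It is an assumption-laden illustration, not a reference implementation: the payload keys other than retry_after_ms ("error", "reason"), the helper names, and the default base and cap values are all hypothetical. The tool returns an error result whose text content carries machine-readable retry timing, and the client combines that hint with exponential backoff and full jitter.

```python
import json
import random
from typing import Callable, Optional


def rate_limit_result(reason: str, retry_after_ms: int) -> dict:
    """Tool-side: an error result with a JSON payload in its text content."""
    payload = {"error": "rate_limited", "reason": reason, "retry_after_ms": retry_after_ms}
    return {"isError": True, "content": [{"type": "text", "text": json.dumps(payload)}]}


def parse_retry_after_ms(result: dict) -> Optional[int]:
    """Client-side: extract retry_after_ms from an error result, if present."""
    if not result.get("isError"):
        return None
    for block in result.get("content", []):
        if block.get("type") != "text":
            continue
        try:
            data = json.loads(block["text"])
        except (KeyError, TypeError, ValueError):
            continue  # text was missing or not JSON; try the next block
        if isinstance(data, dict) and isinstance(data.get("retry_after_ms"), int):
            return data["retry_after_ms"]
    return None


def backoff_delay_ms(
    attempt: int,
    retry_after_ms: Optional[int] = None,
    base_ms: int = 500,
    cap_ms: int = 30_000,
    rng: Callable[[], float] = random.random,
) -> int:
    """Exponential backoff with full jitter, never shorter than the server hint."""
    ceiling = min(cap_ms, base_ms * (2 ** attempt))
    delay = int(rng() * ceiling)  # uniform in [0, ceiling)
    if retry_after_ms is not None:
        # The tool's explicit hint wins when it asks for a longer wait.
        delay = max(delay, retry_after_ms)
    return delay
```

A client would call parse_retry_after_ms on each failed result and pass the value to backoff_delay_ms before sleeping. Taking the maximum of the jittered delay and the hint means the agent never retries earlier than the tool asked, while jitter still spreads out retries when no hint is given.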
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T18:47:40.948601+00:00— report_created — created