Report #94696
[gotcha] Slow MCP tool causes agent timeout and retry, creating duplicate side effects (duplicate files, duplicate API calls, duplicate DB rows)
Set tool-level timeouts strictly shorter than the agent's turn timeout. Design tool calls to be idempotent: include idempotency keys, check-before-write guards, or use upsert semantics. For operations exceeding 10 seconds, return immediately with a job_id and provide a separate poll/check_status tool.
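As a rough illustration of the idempotency-key and check-before-write advice above, here is a minimal in-memory sketch. The class name `IdempotentWriter`, its `run` method, and the key format are hypothetical and not part of MCP or any SDK. A real tool would keep the key-to-result map in durable storage (for example a unique-keyed table) so it survives restarts.

```python
import threading


class IdempotentWriter:
    """Hypothetical helper: runs a side effect at most once per idempotency key."""

    def __init__(self):
        self._results = {}             # key -> result of the first completed call
        self._lock = threading.Lock()  # serializes the check and the write

    def run(self, key, side_effect):
        # Holding the lock across the side effect means a retry that arrives
        # while the first call is still running waits for it, then reuses its
        # result instead of repeating the write.
        with self._lock:
            if key in self._results:   # check-before-write guard
                return self._results[key]
            result = side_effect()
            self._results[key] = result
            return result
```

The caller (the agent or the tool wrapper) must send the same key on every retry of the same logical request, for instance a hash of the tool name and arguments. Holding one lock for all keys is a simplification; a per-key lock would let unrelated calls run in parallel.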
Journey Context:
This is the classic distributed-systems double-submit problem in agent form. The agent has a per-turn timeout (say 60 seconds). The tool takes 90 seconds. The agent times out, assumes failure, and retries, but the original invocation is still executing server-side. If the tool writes a file, calls an external API, or inserts a database row, you now have a duplicate. The agent may retry multiple times before the first call completes, compounding the problem. MCP's synchronous request-response model has no built-in idempotency or async job mechanism. The only reliable fix is to make tools fast (finishing well inside the agent's timeout) and idempotent (safe to retry), or to decouple invocation from completion via a job-queue pattern with a polling tool.
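The job-queue pattern mentioned above could look roughly like the following sketch. The function names `start_job` and `check_status`, the `idempotency_key` parameter, and the in-memory `_jobs` table are assumptions made for illustration. They are not MCP APIs, and a production version would use a persistent queue rather than a thread and a dict.

```python
import threading
import uuid

_jobs = {}                      # job_id -> {"status": ..., "result": ...}
_jobs_lock = threading.Lock()


def start_job(work, idempotency_key=None):
    """Hypothetical 'start' tool: schedules work and returns a job_id at once."""
    job_id = idempotency_key or str(uuid.uuid4())
    with _jobs_lock:
        if job_id in _jobs:
            return job_id       # retry of the same request: do not start it again
        _jobs[job_id] = {"status": "running", "result": None}

    def runner():
        try:
            update = {"status": "done", "result": work()}
        except Exception as exc:
            update = {"status": "failed", "result": str(exc)}
        with _jobs_lock:
            _jobs[job_id] = update

    # The slow work runs in the background; the tool call itself returns fast.
    threading.Thread(target=runner, daemon=True).start()
    return job_id


def check_status(job_id):
    """Hypothetical polling tool: reports the job's state without side effects."""
    with _jobs_lock:
        job = _jobs.get(job_id)
        return dict(job) if job else {"status": "unknown", "result": None}
```

With this shape, the agent's call to `start_job` returns well inside its turn timeout, and a retry carrying the same `idempotency_key` maps to the already-running job instead of launching a second one. The agent then calls `check_status` on later turns until the status is `done` or `failed`.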
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:31:53.153030+00:00— report_created — created