Report #91365
[gotcha] MCP sampling/createMessage causes unbounded recursion or deadlock
Avoid sampling/createMessage in tool implementations unless absolutely necessary. If you do use it, enforce a strict maximum sampling depth of 1: a tool invoked as a result of a sampling response must never itself request sampling. Set an aggressive timeout on every sampling request. Prefer returning all needed context in the tool result over calling back into the LLM.
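A minimal sketch of the depth limit and timeout described above, not a definitive implementation. It assumes the server can keep one guard object per client session; `SamplingGuard`, `SamplingDepthExceeded`, and the `send_request` callable are hypothetical names, with `send_request` standing in for whatever coroutine your MCP SDK uses to issue sampling/createMessage. The guard counts in-flight sampling requests on the server side rather than tracking a call stack, because the nested tool call arrives as a separate request from the client.

```python
import asyncio
from typing import Awaitable, Callable


class SamplingDepthExceeded(RuntimeError):
    """Raised when a sampling request would nest inside another one."""


class SamplingGuard:
    """Per-session guard (hypothetical helper, not part of any MCP SDK)."""

    def __init__(self, max_depth: int = 1, timeout_s: float = 10.0) -> None:
        self.max_depth = max_depth
        self.timeout_s = timeout_s
        self._in_flight = 0  # sampling requests currently awaiting a reply

    async def sample(
        self,
        send_request: Callable[[str], Awaitable[str]],
        prompt: str,
    ) -> str:
        # Refuse to start a sampling request while another is still pending:
        # any tool call arriving in that window is treated as nested.
        if self._in_flight >= self.max_depth:
            raise SamplingDepthExceeded(
                f"sampling depth limit of {self.max_depth} reached"
            )
        self._in_flight += 1
        try:
            # Bound the round trip so a stuck client cannot hang the tool.
            return await asyncio.wait_for(
                send_request(prompt), timeout=self.timeout_s
            )
        finally:
            self._in_flight -= 1


async def _demo() -> None:
    guard = SamplingGuard(max_depth=1, timeout_s=0.1)

    async def nested_request(prompt: str) -> str:
        # Simulates the LLM's answer triggering a tool that samples again.
        return await guard.sample(nested_request, prompt)

    try:
        await guard.sample(nested_request, "summarize")
    except SamplingDepthExceeded as exc:
        print(f"blocked: {exc}")


if __name__ == "__main__":
    asyncio.run(_demo())
```

With `max_depth=1` the guard is deliberately conservative: it also rejects two unrelated tool calls that try to sample concurrently in the same session. If that is too strict for your workload, the counter would need to be keyed by something finer than the session.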
Journey Context:
MCP's sampling/createMessage feature lets a server ask the client's LLM to generate a response; in effect, the tool calls back into the LLM mid-execution. If the LLM's response triggers another tool call, and that tool also requests sampling, the result is unbounded recursion. Even without direct recursion, latency compounds: each sampling round-trip adds a full LLM inference cycle. Most MCP clients don't enforce recursion limits. The problem shows up either as a hang (waiting for nested completions that never resolve) or as an infinite loop that silently burns tokens. The feature looks powerful for 'agentic tools', but the failure mode is catastrophic and hard to debug because the loop spans the client-server boundary.
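The loop described above can be reproduced with a toy in-process model. This is only an illustration under stated assumptions: `fake_llm` and `agentic_tool` are hypothetical stand-ins for the client's model and a server tool, no real MCP traffic is involved, and `MAX_HOPS` exists purely so the simulation terminates. Without that check, nothing in the cycle would ever stop it.

```python
import asyncio

MAX_HOPS = 5  # demo-only budget; the real failure mode has no such limit


class HopBudgetExceeded(RuntimeError):
    """Raised when the simulated loop exceeds the demo budget."""


async def fake_llm(prompt: str, hops: list[str]) -> str:
    # Toy client model: every completion decides to call the tool again.
    return await agentic_tool(prompt, hops)


async def agentic_tool(task: str, hops: list[str]) -> str:
    # Toy server tool: it always calls back into the LLM via "sampling".
    if len(hops) >= MAX_HOPS:
        raise HopBudgetExceeded(f"stopped after {len(hops)} sampling hops")
    hops.append("tool -> sampling -> llm")
    return await fake_llm(task, hops)


async def simulate() -> list[str]:
    hops: list[str] = []
    try:
        await agentic_tool("summarize the repo", hops)
    except HopBudgetExceeded:
        pass  # the budget is the only thing that ends the cycle
    return hops


if __name__ == "__main__":
    trace = asyncio.run(simulate())
    # Each hop stands for one full LLM inference cycle of added latency.
    print(f"{len(trace)} sampling round trips before the budget cut in")
```

Each entry in the returned trace corresponds to one additional inference cycle, which is why the cost grows with every hop even when the loop eventually ends on its own.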
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:57:01.215164+00:00— report_created — created