Report #94481
[gotcha] MCP server requests LLM sampling, which calls a tool, which requests sampling again — infinite recursion
Implement a max-depth counter for sampling requests. MCP servers should track recursion depth and refuse to request sampling when depth exceeds a threshold (e.g., 3). Clients must also enforce a maximum sampling chain length and set a total timeout budget for sampling-derived operations.
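A minimal sketch of the server-side depth counter, assuming the current depth travels with each call in some request-scoped context. The names `CallContext`, `sampling_depth`, `run_tool`, and the `sample` callable are hypothetical and are not part of any MCP SDK; how the depth is actually propagated between client and server is left to the implementation.

```python
from dataclasses import dataclass

MAX_SAMPLING_DEPTH = 3  # example threshold from the report


class SamplingDepthExceeded(RuntimeError):
    """Raised when a tool tries to sample beyond the allowed depth."""


@dataclass(frozen=True)
class CallContext:
    # Hypothetical field: number of sampling hops already made in this chain.
    sampling_depth: int = 0

    def descend(self, max_depth: int = MAX_SAMPLING_DEPTH) -> "CallContext":
        """Return the context for one more sampling hop, or refuse."""
        if self.sampling_depth >= max_depth:
            raise SamplingDepthExceeded(
                f"sampling depth {self.sampling_depth} reached limit {max_depth}"
            )
        return CallContext(self.sampling_depth + 1)


def run_tool(ctx: CallContext, sample) -> str:
    """Hypothetical tool body: samples only if the depth limit allows it."""
    try:
        child = ctx.descend()
    except SamplingDepthExceeded:
        # Degrade gracefully instead of requesting sampling again.
        return "fallback: sampling depth limit reached"
    return sample(child)


# Simulated worst case: every sampled response calls the tool again.
observed_depths = []


def recursive_sample(child: CallContext) -> str:
    observed_depths.append(child.sampling_depth)
    return run_tool(child, recursive_sample)


result = run_tool(CallContext(), recursive_sample)
```

In this simulation the chain stops after three sampling hops and the innermost call returns the fallback string rather than recursing further.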
Journey Context:
The MCP sampling capability allows servers to request completions from the LLM, enabling powerful agentic workflows where a tool can 'think' or 'decide' mid-execution. However, if a tool uses sampling to decide what to do, and the LLM's sampled response includes another tool call that also uses sampling, you get unbounded recursion. This is especially insidious because each level compounds latency and token cost, and the loop may span multiple different tools: tool A samples, the LLM calls tool B, tool B samples, and the LLM calls tool A. The recursion is not obvious from any single tool's code. The MCP spec acknowledges this risk but leaves mitigation entirely to implementations, providing no built-in depth limit.
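The client-side half of the mitigation could look roughly like the sketch below: one budget object per top-level user request, charged once for every sampling request that arrives, regardless of which tool sent it. `SamplingBudget`, `charge`, and the `tool_a`/`tool_b` stand-ins are hypothetical names for illustration only, and the default limits are arbitrary.

```python
import time


class SamplingBudgetExceeded(RuntimeError):
    """Raised when the chain-length or time budget is used up."""


class SamplingBudget:
    """Shared budget for all sampling requests derived from one user request."""

    def __init__(self, max_chain_length=3, total_timeout_s=30.0,
                 clock=time.monotonic):
        self.max_chain_length = max_chain_length
        self.chain_length = 0
        self._clock = clock
        self._deadline = clock() + total_timeout_s

    def remaining_s(self) -> float:
        """Seconds left before the total timeout budget expires."""
        return max(0.0, self._deadline - self._clock())

    def charge(self) -> float:
        """Account for one sampling request; return the time left for it."""
        if self.chain_length >= self.max_chain_length:
            raise SamplingBudgetExceeded("maximum sampling chain length reached")
        if self.remaining_s() <= 0:
            raise SamplingBudgetExceeded("sampling timeout budget exhausted")
        self.chain_length += 1
        return self.remaining_s()


# Stand-ins for the cross-tool loop: A samples, the LLM calls B, B samples, ...
def tool_a(budget: SamplingBudget) -> str:
    budget.charge()
    return tool_b(budget)


def tool_b(budget: SamplingBudget) -> str:
    budget.charge()
    return tool_a(budget)
```

Because the budget is shared across the whole chain rather than owned by a single tool, the A-to-B-to-A loop is cut off even though neither tool recurses on its own. The value returned by `charge()` can be used as the per-request timeout so that later hops cannot outlive the overall deadline.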
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:10:19.613973+00:00— report_created — created