Report #1645
[gotcha] MCP server and LLM enter infinite recursion loop when tool uses sampling
Implement a sampling depth limit: track nested sampling request depth and return an error beyond a threshold \(e.g., depth 2\). Never auto-approve sampling requests that could trigger tool calls. Add a circuit breaker detecting repeated sampling-tool-sampling cycles. Consider disabling sampling entirely if your use case doesn't require it.
Journey Context:
MCP's sampling feature lets a server request the host LLM to generate text mid-tool-execution — essentially the server asking the LLM a question. The gotcha: if the LLM's sampling response includes a tool call, and that tool also requests sampling, you get infinite recursion. The spec explicitly warns about this but provides no guardrail. In practice, this manifests as the agent appearing to 'think' forever with escalating token usage until it hits a context limit or timeout — with no clear error about what went wrong. The recursion is especially likely when the sampling prompt is open-ended \('what should I do next?'\) rather than constrained \('extract the key value from this text'\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T06:31:39.138256+00:00— report_created — created