Report #14512
[tooling] Infinite recursion when an MCP server uses LLM sampling inside tools, causing the agent to loop
Track sampling depth via a context token passed in sampling requests; hard-limit nested sampling to depth=3 by rejecting deeper requests with a static error; expose this recursion limit in server capabilities so the client knows about it.
Journey Context:
When building MCP servers that use LLMs internally (e.g., a code generation tool that calls GPT-4), a naive implementation lets the agent recurse indefinitely: the agent calls the tool -> the tool samples the LLM -> the LLM decides to call the same tool again. The MCP spec defines a 'sampling' capability through which servers request LLM completions from the client, and this is what creates the potential loop. Unlike HTTP depth headers, MCP doesn't natively track nesting depth. The fix is to treat sampling like a call stack: increment a depth counter in the sampling context on every nested request and halt at the threshold, which prevents the equivalent of a stack overflow and the token waste of an infinite loop.
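The depth-counter idea can be sketched in plain Python without any MCP SDK. This is a minimal illustration, not the reporter's implementation. The names `DEPTH_KEY`, `next_sampling_metadata`, `run_tool`, `looping_client`, and `samplingRecursionLimit` are all hypothetical. How the token actually travels in a sampling request, and whether the limit belongs under an `experimental` capabilities entry, are assumptions to verify against your SDK.

```python
from typing import Any, Callable, Dict, List, Optional

MAX_SAMPLING_DEPTH = 3            # limit taken from the report
DEPTH_KEY = "sampling_depth"      # hypothetical key name for the context token
DEPTH_ERROR = "sampling depth limit reached"  # static error text, no dynamic data


class SamplingDepthExceeded(Exception):
    """Raised when a tool would issue a sampling request past the limit."""


def next_sampling_metadata(incoming: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return metadata for the next nested sampling request, or raise.

    A missing token is treated as depth 0 (a top-level tool call).
    """
    current = dict(incoming or {})
    depth = int(current.get(DEPTH_KEY, 0))
    if depth >= MAX_SAMPLING_DEPTH:
        raise SamplingDepthExceeded(DEPTH_ERROR)
    current[DEPTH_KEY] = depth + 1
    return current


def advertised_capabilities() -> Dict[str, Any]:
    """Capabilities fragment exposing the limit (key placement is an assumption)."""
    return {"experimental": {"samplingRecursionLimit": MAX_SAMPLING_DEPTH}}


Sampler = Callable[[Dict[str, Any]], str]


def run_tool(metadata: Optional[Dict[str, Any]], sample: Sampler) -> str:
    """A tool that needs an LLM completion; it checks the depth before sampling."""
    try:
        outgoing = next_sampling_metadata(metadata)
    except SamplingDepthExceeded as exc:
        return f"error: {exc}"
    return sample(outgoing)


calls: List[int] = []  # depths at which sampling actually happened


def looping_client(metadata: Dict[str, Any]) -> str:
    """Stand-in for a client whose model always decides to call the same tool."""
    calls.append(metadata[DEPTH_KEY])
    return run_tool(metadata, looping_client)


if __name__ == "__main__":
    print(run_tool(None, looping_client))
    print(calls)
```

With the worst-case `looping_client`, the chain stops after three nested sampling requests and the fourth is rejected with the static error instead of recursing further. The check sits on the server side before the sampling request is sent, so a client that ignores the advertised limit still cannot drive the loop deeper.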
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T21:45:40.929061+00:00— report_created — created