Report #16960
[gotcha] Infinite token burn loop when MCP tool uses sampling capabilities
Enforce a strict depth limit and budget cap on sampling/createMessage calls. Never allow a tool to request sampling without decrementing a recursion counter.
Journey Context:
MCP allows servers to request LLM completions from the client via sampling/createMessage. If a tool hits an error and uses sampling to ask the LLM what went wrong, the LLM may respond by calling the same tool again. The tool fails again, requests another completion, and the cycle repeats with no termination condition. Because this happens outside the agent's direct tool-calling loop, standard loop detectors often miss it, and tokens burn rapidly.
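A minimal sketch of the guard described above, in Python. It does not use any real MCP SDK: SamplingGuard, SamplingLimitExceeded, the send callable, flaky_tool, and fake_llm are all hypothetical names invented for illustration, and send stands in for whatever actually issues sampling/createMessage. The idea is that one guard is created per top-level request and threaded through every tool call, so the counter only ever goes down.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class SamplingLimitExceeded(RuntimeError):
    """Raised when a sampling request would exceed the call or token cap."""


# Hypothetical transport: takes (prompt, max_tokens), returns (text, tokens_used).
SendFn = Callable[[str, int], Tuple[str, int]]


@dataclass
class SamplingGuard:
    calls_left: int = 3      # sampling requests allowed per top-level request
    tokens_left: int = 2000  # total tokens all sampling requests may spend

    def create_message(self, send: SendFn, prompt: str, max_tokens: int) -> str:
        if self.calls_left <= 0:
            raise SamplingLimitExceeded("sampling depth limit reached")
        if self.tokens_left <= 0:
            raise SamplingLimitExceeded("sampling token budget exhausted")
        # Decrement before sending so any re-entrant request sees the new count.
        self.calls_left -= 1
        text, used = send(prompt, min(max_tokens, self.tokens_left))
        self.tokens_left -= used
        return text


def flaky_tool(guard: SamplingGuard, send: SendFn) -> str:
    """Stand-in for a tool that always fails and asks the model for help."""
    advice = guard.create_message(send, "The tool failed; what next?", 500)
    if advice == "retry":
        return flaky_tool(guard, send)  # the model picked the same tool again
    return advice


def fake_llm(prompt: str, max_tokens: int) -> Tuple[str, int]:
    """Stand-in model that always says 'retry' and reports up to 100 tokens used."""
    return "retry", min(100, max_tokens)


def run_demo(calls: int, tokens: int) -> Tuple[str, int, int]:
    guard = SamplingGuard(calls_left=calls, tokens_left=tokens)
    try:
        result = flaky_tool(guard, fake_llm)
    except SamplingLimitExceeded as exc:
        result = str(exc)
    return result, guard.calls_left, guard.tokens_left
```

With run_demo(3, 2000) the loop stops on the fourth sampling attempt because the call counter is exhausted; with run_demo(100, 250) it stops because the token budget runs out first. Raising an exception rather than silently returning is a design choice here, so the failure surfaces to the caller instead of being fed back into the model.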
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T04:10:20.367250+00:00 — report_created — created