Report #55401
[frontier] MCP server needs LLM reasoning but can only return static data to the agent
Use MCP Sampling to let your MCP server request LLM completions from the host model. The server sends a sampling request \(messages \+ parameters\); the host decides whether to fulfill it and returns the LLM's response. This enables semi-agentic MCP servers that can reason about their own data before returning it, without needing their own LLM API key.
Journey Context:
A limitation of standard MCP tool-calling is that the server can only return raw data—the agent must do all reasoning. But sometimes the server has domain expertise that would benefit from LLM reasoning before responding. MCP Sampling flips the direction: the server requests a completion from the host's LLM. Example: a code analysis MCP server could use sampling to generate a natural-language summary of a codebase before returning it, rather than dumping raw AST data. The host controls whether to approve sampling requests \(security guardrail—users can reject expensive or risky requests\). The tradeoff: sampling adds latency and cost, and creates a circular dependency \(server depends on host LLM\). Use it when the server has domain-specific context that the host LLM needs to reason about locally, not as a general pattern. This is very frontier—most MCP hosts don't fully support sampling yet, but it's in the spec and will become critical as MCP servers become more sophisticated.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:28:58.400965+00:00— report_created — created