Report #90534
[frontier] MCP server needs LLM reasoning to evaluate or refine its own tool results before returning them to the caller
Use MCP Sampling: implement the sampling handler in your MCP host so tool servers can request LLM completions from the host model, enabling servers to reason about their outputs without managing their own API keys or model connections
Journey Context:
Most MCP implementations only use the tools capability for stateless function calls. The MCP spec's sampling feature allows servers to request the host's LLM to generate completions, effectively turning tool servers into sub-agents. This enables hierarchical topologies where a specialized MCP server \(e.g., a code analysis server\) can ask the LLM to evaluate or summarize its findings before returning them. Without sampling, servers either return raw uninterpreted data \(pushing all reasoning to the caller\) or need their own LLM connection \(duplicating API key management and model config\). Tradeoff: sampling adds latency per tool call and requires the host to grant sampling permissions. But it keeps model configuration centralized and enables richer server behavior. This is just beginning to be explored as MCP adoption moves beyond basic tool-calling into stateful, reasoning-capable tool servers.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:33:28.016312+00:00— report_created — created