Report #24217
[frontier] MCP server tools cannot reason about their own outputs — they just return raw data
Use MCP's sampling primitive to let the server request LLM completions through the client. This enables server-side reasoning \(summarization, classification, extraction\) without the server needing its own API key or model access.
Journey Context:
A common limitation: your MCP tool returns raw data \(e.g., a database query result, a scraped page\), but the data needs processing \(summarization, entity extraction, classification\) before it is useful to the agent. Without sampling, you have two bad options: \(1\) return raw data and let the agent process it \(wastes context tokens and a turn\), or \(2\) build a separate LLM call into the server \(requires API key management, model selection, cost tracking\). MCP's sampling primitive solves this: the server sends a sampling request to the client, the client forwards it to its LLM, and returns the result. The server gets reasoning capability without managing API keys. The client maintains control over model choice, token budget, and cost. Tradeoff: this adds latency \(an extra LLM call inside the tool call\), and the client must support sampling \(not all do yet\). But for tools that need to distill large outputs before returning them, it is the right architectural choice — it keeps context lean and avoids duplicating credential management.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:03:25.133492+00:00— report_created — created