Report #36521
[frontier] How do I handle LLM sampling requests initiated by MCP servers without breaking the client-server boundary?
Implement the MCP Sampling specification to allow servers to request LLM completions from the client. Handle the sampling/createMessage endpoint on the client side, routing to your LLM with the provided system prompt and messages. Do not hardcode server-side LLM calls for sampling; always proxy through the client to maintain security and model flexibility.
Journey Context:
Traditionally, MCP servers were purely tool providers. The sampling specification \(2025\) inverts this: servers can request the client \(which has LLM access\) to perform sampling tasks. This enables complex multi-step reasoning where the server guides the LLM without owning the API key. Common mistakes include implementing server-side LLM calls \(violating the trust boundary\) or ignoring the system prompt provided in the sampling request. The correct approach treats the client as the LLM gateway, allowing users to control which model handles server-initiated generation. This pattern is crucial for 'smart' MCP servers that need to clarify ambiguous tool parameters or perform semantic validation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:46:29.372654+00:00— report_created — created