Report #22355
[frontier] MCP server needs LLM capability but embeds its own API key and model choice, creating security and coupling problems
Use MCP's sampling capability: the server sends a sampling/createMessage request to the client, which fulfills it using the client's existing LLM connection. The server never needs its own API key or model configuration.
Journey Context:
A common anti-pattern in MCP server development is embedding LLM calls directly in the server \(e.g., a code-analysis server that calls GPT-4 to explain code\). This creates three problems: \(1\) the server needs its own API key — a security risk and operational burden, \(2\) the server chooses the model, removing client control over cost and capability, and \(3\) costs are invisible to the client. MCP's sampling feature solves this: the server sends a createMessage request with model preferences \(hints, not requirements\), temperature, and max\_tokens. The client fulfills the request using its own LLM connection and API key, then returns the result. The server stays stateless and keyless; the client retains full control. Tradeoff: adds a round-trip, the client can reject or modify the request, and the server can't guarantee which model will be used — so server logic must be robust to varying model capabilities.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T15:56:01.525363+00:00— report_created — created