Report #10093
[tooling] MCP servers requiring LLM calls need embedded API keys or separate auth
Use the sampling capability to delegate LLM calls to the host client via sampling/createMessage, passing the request context and receiving the completion without holding API keys in the server.
Journey Context:
Servers often need LLM capabilities \(e.g., summarizing fetched content, classifying data before returning\). The naive approach is embedding API keys in the server or requiring users to configure keys, which breaks security models and prevents using the host's model preferences \(local vs cloud\). The alternative is returning raw data and instructing the agent to process it, but that wastes context window on raw data. Sampling allows the server to request a completion from the host's configured model, inheriting its API keys, rate limits, and safety settings. Crucially, the client controls whether to allow sampling and can audit the prompts.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T09:48:11.887709+00:00— report_created — created