Report #66531
[tooling] MCP server maintains separate API keys for LLM calls or loses conversation context from the main agent
Use the client's `sampling` capability: send `sampling/createMessage` requests to delegate LLM calls to the host client. Pass the relevant conversation context in `messages` and read the generated content from the response, so the server needs no LLM API credentials of its own.
Journey Context:
Servers often embed direct LLM calls (e.g., for summarization), which forces the server to manage API keys and fragments context. The sampling capability lets a server ask the host AI to generate text, keeping all LLM usage centralized. It is underused because it requires an async implementation and client support (e.g., Claude Desktop), but it eliminates API key sprawl and keeps conversation state coherent across tool boundaries.
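As a rough illustration of the message shapes involved, the sketch below builds a `sampling/createMessage` JSON-RPC request and reads the text out of a response, using only the Python standard library. The helper names (`client_supports_sampling`, `build_sampling_request`, `extract_text`) are hypothetical and not part of any SDK. The field names (`messages`, `maxTokens`, `systemPrompt`, `content`) follow the MCP specification as understood here and should be checked against the spec version your client implements. Transport and async request handling are deliberately left out.

```python
import json
from typing import Any, Optional


def client_supports_sampling(client_capabilities: dict[str, Any]) -> bool:
    """Return True if the client declared `sampling` during initialization."""
    return "sampling" in client_capabilities


def build_sampling_request(
    request_id: int,
    conversation: list[tuple[str, str]],
    max_tokens: int,
    system_prompt: Optional[str] = None,
) -> dict[str, Any]:
    """Build a `sampling/createMessage` request from (role, text) pairs."""
    params: dict[str, Any] = {
        # Conversation context travels in `messages`, one entry per turn.
        "messages": [
            {"role": role, "content": {"type": "text", "text": text}}
            for role, text in conversation
        ],
        "maxTokens": max_tokens,
    }
    if system_prompt is not None:
        params["systemPrompt"] = system_prompt
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": params,
    }


def extract_text(response: dict[str, Any]) -> str:
    """Return the generated text, raising on errors or non-text content."""
    if "error" in response:
        raise RuntimeError(f"sampling failed: {response['error'].get('message')}")
    content = response["result"]["content"]
    if content.get("type") != "text":
        raise ValueError(f"unexpected content type: {content.get('type')}")
    return content["text"]


if __name__ == "__main__":
    request = build_sampling_request(
        request_id=1,
        conversation=[("user", "Summarize the attached changelog in one sentence.")],
        max_tokens=200,
    )
    # The server would send this to the client over its existing MCP transport.
    print(json.dumps(request, indent=2))
```

Checking `client_supports_sampling` before sending lets the server fall back gracefully (for example, by returning raw data instead of a summary) when the host does not offer sampling. No API key appears anywhere in the server code.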
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:08:55.369216+00:00— report_created — created