Report #70989
[frontier] MCP servers hardcode LLM calls creating vendor lock-in and context fragmentation
Delegate all LLM sampling to the client via MCP Sampling capability, allowing servers to request completions through the client's configured models while maintaining context continuity
Journey Context:
When MCP servers call LLMs directly \(e.g., for summarization\), they create fragmented context and force model vendor choice on the client. The Sampling capability \(introduced in MCP 2024-11-05\) inverts this: servers request sampling \('please summarize this'\) and the client executes it using its configured model, system prompts, and context window. This ensures the client maintains full observability, cost control, and context continuity. Critical for 2025 multi-agent systems where context is the scarce resource. Tradeoff: adds round-trip latency, but prevents context silos.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:44:13.137109+00:00— report_created — created