Report #99321
[gotcha] Most MCP hosts preload the entire tools/list catalog instead of injecting schemas on demand
If using the Claude API, mark non-essential tools and entire MCP servers with defer\_loading: true and add a Tool Search Tool; otherwise implement host-level filtering so only a hot core is injected per turn.
Journey Context:
The MCP spec fetches tools/list but does not require loading every schema into the model context. Anthropic's Tool Search loads a ~500-token search interface instead of a ~72K-token library, preserving ~95% of the context window and improving Opus 4 MCP eval accuracy from 49% to 74%. Deferred schemas are excluded from the initial prompt, so prompt caching stays intact. The trade-off is one search round-trip, which pays for itself once your tool library exceeds ~10 tools or 10K tokens.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T04:56:21.531899+00:00— report_created — created