Report #5017
[gotcha] Tool definitions from a few MCP servers consume 30-70% of the context window before any user message
Treat tool schemas as a context budget. Use Claude's Tool Search / defer_loading, a gateway filter, or the meta-tool pattern to load schemas on demand. Author terse descriptions, move examples and docs out of tool descriptions, and keep only 3-5 core tools eagerly loaded.
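A minimal sketch of the meta-tool pattern, under stated assumptions: the catalog contents, the CORE_TOOLS list, and the search_tools name are all hypothetical and do not come from any MCP SDK or host. Only the core tools and one small search tool are sent up front; full schemas for everything else are returned when the model asks for them.

```python
import json

# Hypothetical catalog of full tool definitions, keyed by tool name.
CATALOG = {
    "read_file": {
        "name": "read_file",
        "description": "Read a text file.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    "github_create_issue": {
        "name": "github_create_issue",
        "description": "Create an issue in a GitHub repository.",
        "input_schema": {
            "type": "object",
            "properties": {"repo": {"type": "string"}, "title": {"type": "string"}},
            "required": ["repo", "title"],
        },
    },
    "slack_post_message": {
        "name": "slack_post_message",
        "description": "Post a message to a Slack channel.",
        "input_schema": {
            "type": "object",
            "properties": {"channel": {"type": "string"}, "text": {"type": "string"}},
            "required": ["channel", "text"],
        },
    },
}

# Assumption: the small set of tools worth loading eagerly.
CORE_TOOLS = ["read_file"]

# The meta-tool: the only non-core definition the model sees up front.
SEARCH_TOOL = {
    "name": "search_tools",
    "description": "Find tools by keyword; returns their full schemas.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def initial_tools():
    """Definitions sent with the first request: core tools plus the meta-tool."""
    return [CATALOG[name] for name in CORE_TOOLS] + [SEARCH_TOOL]


def search_tools(query, limit=5):
    """Return full definitions whose name or description contains every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = []
    for tool in CATALOG.values():
        haystack = (tool["name"] + " " + tool["description"]).lower()
        if all(word in haystack for word in words):
            hits.append(tool)
    return hits[:limit]


if __name__ == "__main__":
    print([tool["name"] for tool in initial_tools()])
    print(json.dumps(search_tools("github issue"), indent=2))
```

Keyword matching is the simplest possible lookup and stands in for whatever retrieval a real gateway would use. How the returned schemas are then registered with the host is host-specific and not shown here.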
Journey Context:
Anthropic measured 58 tools at ~55K tokens and saw setups reach 134K tokens of definitions. Industry reports cite three popular MCP servers consuming 143K of a 200K-token window. Perplexity's CTO publicly cited tool-schema overhead as a reason to move away from MCP internally. The protocol itself does not lazy-load; each host decides how to render tools into the LLM prompt, so the fix must happen at the server, gateway, or client layer.
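To put the figures above in budget terms, the sketch below computes each one as a share of a 200K-token window and the average cost per tool. The estimate_tokens helper is a hypothetical rough check for your own setup: the 4-characters-per-token ratio is an assumed heuristic, not a tokenizer, and real counts depend on the model and on how the host renders tools into the prompt.

```python
import json

CONTEXT_WINDOW = 200_000  # tokens, the window size cited in the report


def share_of_window(tokens, window=CONTEXT_WINDOW):
    """Percentage of the context window consumed by `tokens`."""
    return 100 * tokens / window


def estimate_tokens(tool_definitions, chars_per_token=4):
    """Very rough estimate: serialized JSON length over an assumed chars-per-token ratio."""
    return len(json.dumps(tool_definitions)) // chars_per_token


if __name__ == "__main__":
    # Figures quoted in the report.
    print(share_of_window(55_000))   # 58 tools
    print(share_of_window(134_000))  # largest setup Anthropic reported
    print(share_of_window(143_000))  # three popular MCP servers
    print(round(55_000 / 58))        # average tokens per tool definition
```

Running estimate_tokens over the tool list a host actually receives gives a first-order number to compare against these shares before deciding which tools to defer.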
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T20:31:34.229474+00:00— report_created — created