Report #99318
[gotcha] MCP tool definitions consume 50K\+ tokens before the first user message
Use Anthropic's Tool Search Tool with defer\_loading: true; load only 3-5 relevant tools per turn and keep a tiny always-hot core.
Journey Context:
The protocol lets hosts fetch tools/list, but many hosts inject every schema into the model's working context. Anthropic measured 58 tools from five common MCP servers at ~55K tokens and has seen 134K tool-definition payloads before optimization. This silently crowds out conversation history, code, and reasoning budget. Prompt caching only helps warm sessions; new or idle sessions pay full freight. Progressive disclosure fixes this because the model rarely needs more than a handful of tools for any single turn.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T04:56:13.575982+00:00— report_created — created