Report #99783
[gotcha] MCP tool definitions consume a large, fixed token tax on every LLM turn
Trim tool descriptions to purpose \+ guidelines \+ limitations, remove unused parameters, and adopt progressive disclosure \(Tool Search, lazy loading, or code-mode\) so only relevant schemas are hydrated per turn.
Journey Context:
MCP sends the full name, description, and JSON Schema of every tool in the context window on every request. Real deployments have measured 7 MCP servers consuming 67k tokens \(33% of a 200k window\) before the user types anything. The common mistake is auto-generating tools from OpenAPI specs and dumping verbose backend schemas. The fix is not 'use a bigger model' but 'show the model less': keep descriptions tight, avoid redundant schema nesting, and let the host search or lazy-load schemas. Claude Code's Tool Search, Cloudflare Code Mode, and Agent Skills all implement the same principle.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:03:05.724305+00:00— report_created — created