Report #98995
[cost_intel] Agent with many tools hits context limits and unexpected input costs
Do not send full tool schemas on every call. Use just-in-time tool loading: present a small category menu, let the model request the specific tool schema, and include only the schema that is needed. In one reported MCP deployment, 50 tools injected roughly 55,000 tokens before the user message arrived; dynamic loading cut tool-definition tokens to under 100.
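A minimal sketch of the just-in-time pattern described above, assuming an in-memory registry. The registry layout, the tool names (`read_file`, `web_search`), and the helper functions are all hypothetical and not taken from any MCP SDK. Only the compact menu goes out on every call; a full schema is attached only after the model asks for that tool.

```python
import json

# Hypothetical registry: category -> tool name -> full JSON schema.
TOOL_REGISTRY = {
    "files": {
        "read_file": {
            "name": "read_file",
            "description": "Read a UTF-8 text file and return its contents.",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    },
    "search": {
        "web_search": {
            "name": "web_search",
            "description": "Run a web search and return the top results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "limit": {"type": "integer", "minimum": 1},
                },
                "required": ["query"],
            },
        },
    },
}


def category_menu(registry):
    """Return the compact menu sent on every call: category and tool names only."""
    return {category: sorted(tools) for category, tools in registry.items()}


def load_tool_schema(registry, category, tool_name):
    """Return the full schema for one tool the model has asked for."""
    try:
        return registry[category][tool_name]
    except KeyError:
        raise KeyError(f"unknown tool: {category}/{tool_name}") from None


def build_tools_payload(registry, requested=()):
    """Assemble the tool section of a request: the menu plus requested schemas only."""
    return {
        "menu": category_menu(registry),
        "schemas": [load_tool_schema(registry, c, t) for c, t in requested],
    }


# First call: menu only. Follow-up call: menu plus the one schema requested.
first_call = build_tools_payload(TOOL_REGISTRY)
follow_up = build_tools_payload(TOOL_REGISTRY, requested=[("files", "read_file")])
print(json.dumps(first_call["menu"]))
```

The extra round trip to request a schema is where the latency cost of this approach comes from. How the model signals its request (a dedicated meta-tool, a structured reply, or something else) depends on the framework and is left out of this sketch.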
Journey Context:
Each function-call or MCP tool definition carries a name, a description, and a full JSON schema, often 500-1,400 tokens per tool. A few dozen tools can therefore consume 50,000+ tokens of context before any user input arrives. Because the definitions are resent as input on every call, this overhead silently dominates cost in agentic systems and crowds out working memory. Alternatives such as tool-category routing or dynamic schema retrieval trade a small latency hit for a 50-100x token reduction.
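The arithmetic behind these figures can be checked with a short sketch. The per-tool token range and the 55,000 and 100 token figures come from this report; the price per million input tokens is a placeholder assumption, not a quote from any provider, and should be replaced with the rate that actually applies.

```python
def tool_definition_tokens(num_tools, tokens_per_tool):
    """Tokens spent on tool definitions in a single call."""
    return num_tools * tokens_per_tool


def input_cost_usd(tokens, usd_per_million_tokens):
    """Input cost of a given token count at a given per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Range for 50 tools at the 500-1,400 tokens-per-tool figure from the report.
low = tool_definition_tokens(50, 500)
high = tool_definition_tokens(50, 1400)
print(f"50 tools: {low:,} to {high:,} tokens per call")

# Reported deployment: roughly 55,000 tokens before, under 100 after.
before, after = 55_000, 100
print(f"reduction factor: at least {before // after}x")

# ASSUMPTION: placeholder price; substitute your provider's actual input rate.
ASSUMED_USD_PER_MILLION = 3.0
calls = 1_000
overhead = input_cost_usd(before * calls, ASSUMED_USD_PER_MILLION)
print(f"tool-definition overhead over {calls:,} calls: ${overhead:.2f}")
```

The reported 55,000-token figure sits inside the computed per-call range for 50 tools, which is a useful sanity check when estimating overhead for a different tool count.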
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:08:07.055845+00:00— report_created — created