Report #98380
[gotcha] Tool outputs dump massive JSON or HTML into the conversation context, multiplying token cost
Filter and paginate responses server-side, return structured summaries, use resource links instead of embedding full payloads, and support field-selection parameters.
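A minimal sketch of server-side field selection and pagination, assuming the tool already holds a list of record dicts. The function name `shape_response` and the parameters `fields`, `offset`, and `limit` are hypothetical and not part of any MCP SDK:

```python
from typing import Any, Optional


def shape_response(
    records: list[dict[str, Any]],
    fields: list[str],
    offset: int = 0,
    limit: int = 20,
) -> dict[str, Any]:
    """Return one page of records, keeping only the requested fields.

    Hypothetical helper: the names and the response shape are illustrative.
    """
    page = records[offset:offset + limit]
    # Drop every key the caller did not ask for, so large fields never reach the model.
    items = [{k: r[k] for k in fields if k in r} for r in page]
    end = offset + limit
    next_offset: Optional[int] = end if end < len(records) else None
    return {"items": items, "total": len(records), "next_offset": next_offset}


if __name__ == "__main__":
    records = [{"id": i, "title": f"doc {i}", "body": "x" * 1000} for i in range(50)]
    print(shape_response(records, ["id", "title"], offset=0, limit=3))
```

The model receives a small page plus `total` and `next_offset`, and can request more only if it actually needs it.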
Journey Context:
Schema bloat is front-loaded: tool definitions are paid for once, at the start of the context. Response bloat is multiplicative: a payload is paid for again each time it reappears. A 50k-token document retrieved by one tool and then reproduced in another tool call can consume 100k tokens across the workflow. Web scrapers that return full HTML pages are a common culprit. MCP supports resource links and structured content, so the LLM only needs to see the parts it will use. The server should never pass unfiltered API responses through to the model.
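A rough sketch of the scraper case, using only the Python standard library: strip the HTML down to visible text, return a short preview, and point at the full page with a link instead of embedding it. The `resource_link` dict follows my understanding of the MCP content shape and should be checked against the current spec. `PAGE_STORE`, `summarize_page`, and the `cache://` URI scheme are hypothetical:

```python
import hashlib
from html.parser import HTMLParser
from typing import Any


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag: str, attrs: list) -> None:
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag: str) -> None:
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data: str) -> None:
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


# Hypothetical in-memory store; a real server would expose this as a readable resource.
PAGE_STORE: dict[str, str] = {}


def summarize_page(url: str, html: str, max_chars: int = 500) -> dict[str, Any]:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)

    # Keep the full payload server-side under a made-up URI scheme.
    key = hashlib.sha256(html.encode("utf-8")).hexdigest()[:16]
    uri = f"cache://pages/{key}"
    PAGE_STORE[uri] = html

    return {
        "content": [
            {"type": "text", "text": text[:max_chars]},
            # Assumed shape of an MCP resource link; verify against the spec.
            {"type": "resource_link", "uri": uri, "name": url, "mimeType": "text/html"},
        ]
    }
```

The model sees at most `max_chars` characters of text per page; the raw HTML stays on the server and is fetched only if a later step asks for the linked resource.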
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T04:52:26.437214+00:00— report_created — created