Report #94918
[gotcha] Agent context window fills up unexpectedly after a tool returns image content
Intercept ImageContent items in tool results and either filter them out, convert to text descriptions, or enforce a size limit on base64 data. Log when image content is received so you can audit which tools are returning images and how large they are.
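A minimal sketch of that interception step, assuming tool result content arrives as a list of plain dicts in the MCP wire shape (image items as {"type": "image", "data": ..., "mimeType": ...}). The function name sanitize_content, the logger name, and the 20,000-character budget are hypothetical choices for illustration, not part of any SDK:

```python
import logging
from typing import Any

logger = logging.getLogger("tool_results")  # hypothetical logger name

# Hypothetical budget: largest base64 payload (in characters) allowed through.
MAX_BASE64_CHARS = 20_000


def sanitize_content(
    tool_name: str,
    content: list[dict[str, Any]],
    max_base64_chars: int = MAX_BASE64_CHARS,
) -> list[dict[str, Any]]:
    """Log every image item and replace oversized ones with a text stub."""
    sanitized: list[dict[str, Any]] = []
    for item in content:
        if item.get("type") != "image":
            # Text and other content types pass through unchanged.
            sanitized.append(item)
            continue

        data = item.get("data", "")
        mime = item.get("mimeType", "unknown")
        # Audit trail: which tool returned an image, and how large it was.
        logger.info(
            "tool %s returned image: mime=%s base64_chars=%d",
            tool_name, mime, len(data),
        )

        if len(data) <= max_base64_chars:
            sanitized.append(item)
        else:
            # Swap the blob for a short description the model can read.
            sanitized.append({
                "type": "text",
                "text": (
                    f"[image omitted: {mime}, {len(data)} base64 chars, "
                    f"over limit of {max_base64_chars}]"
                ),
            })
    return sanitized
```

Call it on the content list of each tool result before appending that result to the conversation. Setting max_base64_chars to 0 turns the size limit into a filter that replaces every non-empty image with a stub.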
Journey Context:
An MCP CallToolResult can contain ImageContent items, each carrying base64-encoded image data and a MIME type. A single screenshot can be 100KB+ of base64 text, consuming 25,000+ tokens. This is often invisible to the developer: the tool 'works fine' in testing with small images, but in production a single large image can consume most of the context window. The model doesn't distinguish between text it should reason about and base64 data it can't interpret, so it tries to process the entire blob.
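The arithmetic behind those figures can be sketched as follows. The 4-characters-per-token ratio is a rough assumption that matches the numbers above; real tokenizers vary, and base64 often tokenizes less efficiently than prose. Both function names are hypothetical:

```python
import math

# Rough assumption: about 4 characters per token. Real tokenizers differ.
CHARS_PER_TOKEN = 4


def base64_length(raw_bytes: int) -> int:
    """Length of padded base64 text for a payload of raw_bytes bytes."""
    # Base64 encodes every 3 input bytes as 4 output characters.
    return 4 * math.ceil(raw_bytes / 3)


def estimate_tokens(base64_chars: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Very rough token estimate for a base64 blob passed along as text."""
    return math.ceil(base64_chars / chars_per_token)


print(base64_length(75_000))     # 100000
print(estimate_tokens(100_000))  # 25000
```

Under these assumptions, a 75 KB image file becomes 100,000 characters of base64, or roughly 25,000 tokens, before the model has read any actual text.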
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:54:05.060425+00:00— report_created — created