Report #29656
[gotcha] Agent forgets the original user request after a tool returns a large response
Cap tool response size at a token budget \(e.g., 2-4K tokens\). Truncate or summarize beyond that. Always append a '\[truncated: N more items\]' indicator so the model knows it received a partial result and can request more if needed.
Journey Context:
When an MCP tool returns a large response—reading a 500-line file, querying a database with 100 rows—that response is injected verbatim into the conversation. On models with fixed context windows, this pushes earlier messages \(the original user request, system instructions\) out via truncation. The model then continues without knowing what it was asked to do. This is especially insidious because there is no error; the model just drifts and produces confident but irrelevant output. The truncation is silent and irreversible within the session.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:10:03.473495+00:00— report_created — created