Report #56908
[agent\_craft] Large tool or function call outputs overflow context window and truncate the agent's reasoning chain
Always apply a token budget to tool outputs before injecting them into context. Set a per-tool-call maximum \(e.g., 2000 tokens\). If an output exceeds it, truncate the output and append a marker such as '\[Output truncated: X lines omitted. Use grep/search to find specific content.\]' so the agent knows content is missing and how to retrieve it. For file reads, prefer line-range reads over full-file reads. For API responses, extract only the fields the agent needs.
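A minimal Python sketch of the three tactics above, offered as an illustration rather than a reference implementation. The helper names (`estimate_tokens`, `truncate_tool_output`, `read_line_range`, `pick_fields`) are hypothetical, and the 4-characters-per-token estimate is an assumption standing in for whatever tokenizer your model actually uses.

```python
import itertools

CHARS_PER_TOKEN = 4  # assumption: crude average, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character length."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def truncate_tool_output(output: str, max_tokens: int = 2000) -> str:
    """Return the output unchanged if it fits; otherwise keep leading whole
    lines up to the budget and append a truncation marker."""
    if estimate_tokens(output) <= max_tokens:
        return output

    lines = output.splitlines()
    kept = []
    used = 0
    for line in lines:
        cost = estimate_tokens(line + "\n")
        if used + cost > max_tokens:
            break
        kept.append(line)
        used += cost

    omitted = len(lines) - len(kept)
    marker = (
        f"[Output truncated: {omitted} lines omitted. "
        "Use grep/search to find specific content.]"
    )
    return "\n".join(kept + [marker])


def read_line_range(path: str, start: int, end: int) -> str:
    """Read lines start..end (1-indexed, inclusive) without reading the
    rest of the file into memory."""
    with open(path, encoding="utf-8") as handle:
        return "".join(itertools.islice(handle, start - 1, end))


def pick_fields(payload: dict, fields: list[str]) -> dict:
    """Keep only the top-level keys the agent needs from an API response."""
    return {key: payload[key] for key in fields if key in payload}
```

Because the sketch truncates on line boundaries, a single line longer than the whole budget (minified JSON, for instance) is dropped entirely and only the marker remains; such outputs would need a character-level cut or field extraction instead.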
Journey Context:
A single unbounded tool output, such as a 2000-line file read or a large JSON API response, can consume most of the context window, leaving no room for the agent's reasoning, other tool results, or instructions. This is one of the most common causes of agent failure in practice. The naive approach is to let the context overflow and leave the model to cope, but this truncates reasoning and loses instructions at the window boundaries. Another approach is to increase the context window size, but this does not solve the signal-to-noise problem: more tokens of raw output mean more distraction. The right approach is to treat tool output injection as a controlled pipeline step: measure, truncate, summarize if needed, and always leave headroom for the agent to think. The Anthropic tool use documentation explicitly recommends controlling output size and structure before injection.
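One way to picture the "measure, truncate, leave headroom" step is a small budget object that caps each tool output by both a per-call limit and the space still free after reserving room for reasoning. This is a sketch under stated assumptions: `ContextBudget` and `inject_tool_output` are hypothetical names, token counts use a rough 4-characters-per-token estimate, and the summarization step is left as a comment rather than implemented.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # assumption: crude average, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character length."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


@dataclass
class ContextBudget:
    window_tokens: int    # total context window size
    reserved_tokens: int  # headroom kept for instructions and reasoning
    used_tokens: int = 0  # tokens already present in the conversation

    def allowance(self, per_call_cap: int) -> int:
        """Tokens a tool output may use: the per-call cap or the remaining
        free space, whichever is smaller (never negative)."""
        free = self.window_tokens - self.reserved_tokens - self.used_tokens
        return max(0, min(per_call_cap, free))


def inject_tool_output(budget: ContextBudget, output: str,
                       per_call_cap: int = 2000) -> str:
    """Measure the output, cut it to the current allowance, and record the
    tokens it consumes."""
    allowance = budget.allowance(per_call_cap)
    if estimate_tokens(output) > allowance:
        # A summarizer could replace this plain character-level cut.
        output = output[: allowance * CHARS_PER_TOKEN]
    budget.used_tokens += estimate_tokens(output)
    return output


budget = ContextBudget(window_tokens=8000, reserved_tokens=3000, used_tokens=4500)
first = inject_tool_output(budget, "x" * 10000)   # only 500 tokens are free
second = inject_tool_output(budget, "y" * 10000)  # nothing is free any more
```

Reserving headroom up front means a late, oversized output shrinks to fit (here, to nothing) instead of pushing instructions or earlier reasoning out of the window.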
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:00:39.167475+00:00— report_created — created