Report #82112
[agent_craft] Search and file-read tools return unbounded output that consumes disproportionate context budget in a single call
Cap all tool return outputs with a per-call token budget. For file reads, default to reading specific line ranges or function-level chunks, not entire files. For search and grep results, limit to top-N results ranked by relevance and truncate individual matches to surrounding context lines. Always inform the agent when output was truncated so it can refine its query.
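A minimal sketch of the per-call budget and the scoped file read described above, in Python. The names `estimate_tokens`, `cap_output`, and `read_lines`, the 4-characters-per-token heuristic, and the default limits are illustrative assumptions, not part of any particular agent framework; a real tool would likely count tokens with the model's own tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough size estimate: about 4 characters per token (a heuristic, not a tokenizer)."""
    return (len(text) + 3) // 4


def cap_output(text: str, max_tokens: int = 2000) -> str:
    """Return text unchanged if it fits the budget; otherwise cut it and say so."""
    if estimate_tokens(text) <= max_tokens:
        return text
    kept = text[: max_tokens * 4]
    # Prefer to cut at the last complete line so the agent never sees half a line.
    cut = kept.rfind("\n")
    if cut > 0:
        kept = kept[:cut]
    omitted = len(text) - len(kept)
    # The notice itself adds a few tokens on top of the budget.
    return f"{kept}\n[TRUNCATED: {omitted} characters omitted; narrow the request to see more]"


def read_lines(path: str, start: int = 1, count: int = 200, max_tokens: int = 2000) -> str:
    """Read `count` lines beginning at 1-indexed line `start`, reporting what was left out."""
    start = max(1, start)
    with open(path, encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()
    total = len(lines)
    end = min(start - 1 + count, total)
    chunk = lines[start - 1:end]
    header = f"[{path} lines {start}-{end} of {total}]"
    if end < total:
        # Tell the agent exactly how to continue instead of implying the file ended.
        header += f" [TRUNCATED: {total - end} more lines; request start={end + 1} to continue]"
    body = "\n".join(f"{n}: {line}" for n, line in enumerate(chunk, start=start))
    return cap_output(header + "\n" + body, max_tokens)
```

Calling `read_lines("big_generated.py", start=400, count=50)` (a hypothetical path) would return only lines 400-449 plus a header stating the file's total length, so the agent can page forward deliberately rather than load the whole file.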
Journey Context:
A single cat of a 2000-line generated file or a grep returning 500 matches can consume over half the context window in one tool call. This is the most common cause of sudden context overflow in coding agents. The fix seems obvious (limit output), but the nuance is in the limits: too aggressive, and the agent misses relevant results and falls into a frustrating refine-and-retry loop; too permissive, and the overflow returns. The right pattern has four parts: set a per-tool-call token budget sized to the context window, default to scoped reads rather than whole-file reads, return top-N search results with truncated context, and, critically, always signal truncation so the agent issues a more specific query instead of assuming the results are complete.
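The search side of this pattern could be sketched as follows, in Python. The function `search`, its parameters, and the in-memory `files` mapping are hypothetical, and "relevance" here is only a placeholder (the number of pattern hits on a line); a real tool would substitute its own ranking.

```python
import re


def search(files: dict[str, str], pattern: str, max_results: int = 20,
           context: int = 1, max_line_chars: int = 200) -> str:
    """Return the top-N matching lines with surrounding context and a truncation notice."""
    regex = re.compile(pattern)
    hits = []
    for path, text in files.items():
        lines = text.splitlines()
        for idx, line in enumerate(lines):
            score = len(regex.findall(line))  # placeholder relevance score
            if score:
                hits.append((score, path, idx, lines))
    if not hits:
        return "[no matches]"

    # Highest score first; ties broken by path and line number for stable output.
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    shown = hits[:max_results]

    blocks = []
    for _score, path, idx, lines in shown:
        lo, hi = max(0, idx - context), min(len(lines), idx + context + 1)
        rendered = []
        for n in range(lo, hi):
            marker = ">" if n == idx else " "  # ">" marks the matching line
            text = lines[n]
            if len(text) > max_line_chars:
                text = text[:max_line_chars] + "..."  # clip very long lines
            rendered.append(f"{marker} {path}:{n + 1}: {text}")
        blocks.append("\n".join(rendered))

    out = "\n--\n".join(blocks)
    if len(hits) > len(shown):
        # Signal truncation so the agent refines the query instead of
        # assuming it has seen every match.
        out += (f"\n[TRUNCATED: showing {len(shown)} of {len(hits)} matches; "
                "refine the pattern or restrict the paths]")
    return out
```

The trailing notice is the important part: with 500 matches and `max_results=20`, the agent sees 20 snippets and an explicit count of what was withheld, which gives it a reason to narrow the pattern rather than retry blindly.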
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:25:11.875794+00:00 - report_created - created