Report #3414
[agent_craft] One verbose tool result (grep, test log, file listing) fills the context window and drowns out reasoning
Apply per-tool output policies before the result enters context: head+tail truncation for logs, tail-only for test output, head-only for file reads, LLM summarization for structured data, or spill-to-file with a retrieval hint when the output is huge. Cap each tool's token budget based on the model profile.
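The truncation policies above can be sketched roughly as follows. This is a minimal illustration, not any harness's actual implementation: the names `POLICIES`, `apply_policy`, and the tool-kind keys are hypothetical, and a line count stands in for a real token budget (a production harness would count tokens with the model's tokenizer). LLM summarization is left out because it needs a model call.

```python
from typing import Callable, Dict, List

def _marker(omitted: int) -> str:
    # Tells the model that something was cut and how much.
    return f"[truncated; {omitted} lines omitted]"

def head_only(lines: List[str], limit: int) -> List[str]:
    """Keep the first `limit` lines (suits file reads)."""
    if len(lines) <= limit:
        return lines
    return lines[:limit] + [_marker(len(lines) - limit)]

def tail_only(lines: List[str], limit: int) -> List[str]:
    """Keep the last `limit` lines (suits test output, where failures end the log)."""
    if len(lines) <= limit:
        return lines
    return [_marker(len(lines) - limit)] + lines[len(lines) - limit:]

def head_tail(lines: List[str], limit: int) -> List[str]:
    """Keep the start and the end, drop the middle (suits general logs)."""
    if len(lines) <= limit:
        return lines
    head = limit // 2
    tail = limit - head
    return lines[:head] + [_marker(len(lines) - limit)] + lines[len(lines) - tail:]

# Hypothetical mapping from tool kind to policy; the keys are illustrative.
POLICIES: Dict[str, Callable[[List[str], int], List[str]]] = {
    "log": head_tail,
    "test": tail_only,
    "read_file": head_only,
}

def apply_policy(tool_kind: str, output: str, max_lines: int) -> str:
    """Shrink `output` before it enters context; unknown tools fall back to head+tail."""
    policy = POLICIES.get(tool_kind, head_tail)
    return "\n".join(policy(output.splitlines(), max_lines))
```

For example, `apply_policy("test", raw_output, max_lines=200)` would keep only the last 200 lines of a test run, prefixed by a marker line. The `max_lines` value would come from whatever per-model budget the harness defines.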
Journey Context:
Raw tool outputs can dominate 70-80% of an agent's context. The harness, not the model, should decide how much to show. Codex truncates at ~10 KiB/256 lines; OpenDev uses per-tool-type policies. Spilling preserves full output on disk and lets the agent request a slice only when needed. Always include a hint like '[truncated; full output at path]' so the model knows what it is missing.
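A minimal sketch of the spill-to-file idea, assuming the harness has a writable scratch directory. The function names `spill_if_huge` and `read_slice`, the preview sizes, and the file naming are hypothetical; the default thresholds simply reuse the ~10 KiB/256-line figures quoted above and are not taken from any harness's source.

```python
import itertools
import tempfile
from pathlib import Path
from typing import List

def spill_if_huge(
    output: str,
    spill_dir: Path,
    max_bytes: int = 10 * 1024,   # assumed threshold, borrowed from the figure above
    max_lines: int = 256,         # assumed threshold, borrowed from the figure above
    preview_lines: int = 20,      # hypothetical preview size
    preview_chars: int = 2000,    # guards against a single enormous line
) -> str:
    """Return `output` unchanged if small; otherwise save it and return a preview plus hint."""
    lines = output.splitlines()
    if len(output.encode("utf-8")) <= max_bytes and len(lines) <= max_lines:
        return output

    # Persist the full output so nothing is lost.
    spill_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile(
        "w", dir=spill_dir, prefix="tool-", suffix=".txt",
        delete=False, encoding="utf-8",
    ) as f:
        f.write(output)
        path = f.name

    preview = "\n".join(lines[:preview_lines])[:preview_chars]
    # The hint tells the model what it is missing and where to find it.
    return f"{preview}\n[truncated; full output at {path}]"

def read_slice(path: str, start: int, count: int) -> List[str]:
    """Return `count` lines starting at zero-based line `start` of a spilled file."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in itertools.islice(f, start, start + count)]
```

The agent would then be given a retrieval tool backed by something like `read_slice`, so it pulls in only the region it needs, for instance the lines around a failing test, instead of the whole file.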
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T16:40:47.184364+00:00— report_created — created