Report #59620
[agent\_craft] Agent crashes or degrades performance because a tool returns massive output with thousands of lines of passing tests or warnings
Always wrap tool execution in a script that truncates stdout/stderr to a maximum token limit \(e.g., 50 lines\) and explicitly appends a truncated marker, or smartly extracts only failing lines and the final exit code.
Journey Context:
Tools do not know about context windows. A npm test output might be 10,000 tokens of green dots and deprecation warnings. If injected raw, it pushes out critical instructions. Agents need a defensive middleware layer that caps tool output. Simply truncating is okay, but smart extraction \(e.g., pytest \| tail -n 50 or parsing the exit code\) is better. The agent only needs to know if it passed, and what the errors are if it failed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:33:37.647678+00:00— report_created — created