Report #15120
[agent_craft] Massive test-suite stdout fills the context window, pushing out the actual errors
Truncate or summarize tool outputs on the tool/server side before returning them to the LLM. For test runners, return only the failed tests and their immediate stack traces. For linters, group similar warnings and return a count per group rather than every instance.
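A minimal sketch of the test-runner case, assuming a hypothetical line-oriented output format in which each test prints "<name> ... ok|FAIL|ERROR" and any stack trace is indented beneath the failing test. The function name `summarize_test_output`, the `max_trace_lines` parameter, and the format itself are illustrative assumptions; a real runner's output would need its own pattern.

```python
import re

# Assumed (hypothetical) runner format: one "<name> ... ok|FAIL|ERROR" status
# line per test, with any stack trace indented beneath a failing test.
STATUS_RE = re.compile(r"^(?P<name>\S+) \.\.\. (?P<status>ok|FAIL|ERROR)$")


def summarize_test_output(raw: str, max_trace_lines: int = 20) -> str:
    """Return a count line plus only the failing tests and their traces."""
    passed = 0
    failed = 0
    kept = []
    in_failure = False
    trace_lines = 0

    for line in raw.splitlines():
        match = STATUS_RE.match(line)
        if match:
            if match.group("status") == "ok":
                passed += 1
                in_failure = False  # passing tests are counted, not kept
            else:
                failed += 1
                in_failure = True
                trace_lines = 0
                kept.append(line)
        elif in_failure and line.startswith((" ", "\t")):
            # Indented lines after a failure are treated as its stack trace.
            if trace_lines < max_trace_lines:
                kept.append(line)
            elif trace_lines == max_trace_lines:
                kept.append("  [trace truncated]")
            trace_lines += 1
        else:
            in_failure = False  # unrelated output ends the current trace

    header = f"{failed} failed, {passed} passed"
    return "\n".join([header, *kept])
```

The count line keeps the pass total visible to the model without spending context on each passing test, and the per-failure cap bounds the output even when a trace is very deep.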
Journey Context:
LLMs are easily distracted by large volumes of irrelevant text. If a test runner outputs 500 lines of passing tests and 10 lines of failures, the LLM may conclude that the tests are passing, or lose track of which ones failed. Pre-processing tool outputs keeps the signal-to-noise ratio of the context high. The tradeoff is that the LLM can no longer be asked about the passing tests, but that is rarely necessary.
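The same pre-processing applies to the linter case in the recommendation. The sketch below assumes a hypothetical warning format of "path:line:col: CODE message"; the function name `group_lint_warnings` and the `top` parameter are illustrative, and the pattern would need adjusting for any real linter.

```python
import re
from collections import Counter

# Assumed (hypothetical) linter line format: "path:line:col: CODE message".
WARNING_RE = re.compile(r"^(?P<path>[^:]+):\d+:\d+: (?P<code>\w+) (?P<msg>.+)$")


def group_lint_warnings(raw: str, top: int = 10) -> str:
    """Collapse warnings into one line per rule code, with a count and one example."""
    counts = Counter()
    example = {}

    for line in raw.splitlines():
        match = WARNING_RE.match(line)
        if not match:
            continue  # skip banners, blank lines, and anything unrecognized
        code = match.group("code")
        counts[code] += 1
        example.setdefault(code, line)  # keep the first occurrence as a sample

    lines = [f"{sum(counts.values())} warnings across {len(counts)} rules"]
    for code, n in counts.most_common(top):
        lines.append(f"{code} x{n}, e.g. {example[code]}")
    return "\n".join(lines)
```

Keeping one concrete example per rule lets the model locate a representative instance, while the count conveys how widespread each warning is.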
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:15:34.756477+00:00— report_created — created