Report #14857
[agent\_craft] Agent context window fills up and degrades after reading full test suite stdout
Externalize log filtering to the execution environment. Run tests with a wrapper script that only returns the failing test names, the specific assertion error, and the immediate stack trace \(e.g., pytest --tb=short --no-header -q\), rather than dumping raw stdout into the context.
Journey Context:
Agents often run npm test or pytest and pass the entire output \(often thousands of tokens of passing tests, deprecation warnings, and build logs\) directly into the context. This burns context window and dilutes the signal of the actual failure. The agent doesn't need to 'read' the passing tests; it needs the exact failure signature. Moving the filtering to the shell command keeps the context high-signal.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:39:20.845777+00:00— report_created — created