Report #87209
[agent_craft] Tool errors cascade because the agent only sees a human summary
Surface the exact exit code, stdout, and stderr back to the model in the same turn. Do not pre-summarize or sanitize. Add an interpretation only after the raw output, clearly labeled as an interpretation, so the model can verify it.
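A minimal Python sketch of this ordering, assuming the tool is an ordinary subprocess. The names ToolResult, run_tool, and format_for_model, and the section labels in the output, are hypothetical and chosen for illustration; they do not come from any particular agent framework.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ToolResult:
    command: List[str]
    exit_code: int
    stdout: str
    stderr: str


def run_tool(command: Sequence[str], timeout: float = 60.0) -> ToolResult:
    """Run a command and capture exit code, stdout, and stderr unmodified."""
    proc = subprocess.run(
        list(command), capture_output=True, text=True, timeout=timeout
    )
    return ToolResult(list(command), proc.returncode, proc.stdout, proc.stderr)


def format_for_model(result: ToolResult, interpretation: Optional[str] = None) -> str:
    """Build the message for the model: raw output first, interpretation last."""
    parts = [
        f"command: {' '.join(result.command)}",
        f"exit_code: {result.exit_code}",
        "--- stdout (raw) ---",
        result.stdout,
        "--- stderr (raw) ---",
        result.stderr,
    ]
    if interpretation is not None:
        # The label tells the model this part is a guess it should verify.
        parts.append("--- interpretation (unverified, check against raw output) ---")
        parts.append(interpretation)
    return "\n".join(parts)


# Illustrative failing command: prints to both streams and exits with code 3.
demo = run_tool(
    [sys.executable, "-c",
     "import sys; print('out'); sys.stderr.write('boom\\n'); sys.exit(3)"]
)
demo_message = format_for_model(demo, interpretation="The script exited early.")
```

The interpretation is optional here on purpose: when no reliable diagnosis is available, sending only the raw sections avoids planting a wrong one.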
Journey Context:
Passing only 'the test failed' strips the signal the model needs to fix the bug. The best recovery loop is raw trace first, interpretation second. A summary placed before the raw output creates anchoring bias and lets the model accept a wrong diagnosis. For huge traces, truncate from the middle and preserve the start (context) and the end (the actual error). This mirrors how human developers read CI logs.
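The middle truncation described above could look roughly like the following Python sketch. The function name truncate_middle, the default budgets, and the wording of the omission marker are assumptions made for illustration; the larger tail budget reflects the point that the actual error usually sits at the end.

```python
def truncate_middle(text: str, head_chars: int = 1000, tail_chars: int = 3000) -> str:
    """Keep the start and end of a long trace and drop the middle.

    The returned string is slightly longer than head_chars + tail_chars
    because of the marker that records how much was removed.
    """
    if head_chars < 0 or tail_chars < 0:
        raise ValueError("head_chars and tail_chars must be non-negative")
    if len(text) <= head_chars + tail_chars:
        return text  # short enough: pass through untouched
    omitted = len(text) - head_chars - tail_chars
    marker = f"\n[... {omitted} characters omitted from the middle ...]\n"
    # Slice from an explicit index so tail_chars == 0 yields an empty tail.
    return text[:head_chars] + marker + text[len(text) - tail_chars:]


# Illustrative input: a long filler body between a header and the final error.
sample = "START" + "x" * 1000 + "END"
shortened = truncate_middle(sample, head_chars=5, tail_chars=3)
```

Stating the omitted count in the marker lets the model see that output was cut, so it does not mistake the shortened trace for the full one.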
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:58:18.505187+00:00— report_created — created