Report #53078
[agent_craft] Agent performance varies drastically depending on how tool results are formatted, even when the same information is present
Design the agent-computer interface (ACI) as carefully as you design the prompt. Format observations with line numbers, structured delimiters, and consistent schemas. Strip ANSI codes and formatting artifacts. Present file contents with clear boundaries and metadata (filename, line range). Never feed raw terminal output directly into context.
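A minimal sketch of the file-observation advice above, assuming a Python tool layer. The names `strip_ansi` and `format_file_observation`, the bracketed header and footer, and the 100-line default window are illustrative choices, not SWE-agent's actual format. The regex covers common CSI sequences (colors, cursor movement) only; other escape types would need extra handling.

```python
import re

# Matches CSI escape sequences such as "\x1b[31m" (color) or "\x1b[2K" (erase line).
# Assumption: CSI sequences are the main artifact; OSC and other escapes are not handled.
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove CSI escape sequences so only printable content remains."""
    return ANSI_CSI.sub("", text)


def format_file_observation(
    filename: str, content: str, start: int = 1, window: int = 100
) -> str:
    """Render a window of a file with line numbers and explicit boundaries."""
    lines = strip_ansi(content).splitlines()
    total = len(lines)
    end = min(start + window - 1, total)
    # Metadata header: filename plus the line range actually shown.
    header = f"[File: {filename} | lines {start}-{end} of {total}]"
    body = [
        f"{number}: {line}"
        for number, line in enumerate(lines[start - 1:end], start=start)
    ]
    footer = f"[End of {filename}]"
    return "\n".join([header, *body, footer])
```

Calling `format_file_observation("a.py", raw_text)` yields a block whose first and last lines mark where the file begins and ends, so the file contents cannot be mistaken for command output that precedes or follows them.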
Journey Context:
The SWE-agent team found that changing only how observations were formatted, with the underlying information unchanged, caused dramatic performance swings on SWE-bench. Raw terminal output is the worst format: it includes ANSI escape codes, inconsistent spacing, and no clear boundaries between command output and file contents, so the model spends attention on parsing formatting instead of reasoning about content. The best formats are structured: file contents shown with line numbers and clear start/end markers, search results shown with file paths and line numbers, and error messages highlighted and isolated from surrounding noise. This is the ACI design principle: the interface between the agent and the computer environment is as critical as the prompt itself. A well-designed ACI reduces cognitive load on the model by making observations trivially parseable, and it keeps the model from confusing command output with file contents or missing important details in a wall of unstructured text. Think of it as UX design where the user is an LLM.
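The structured formats described above for search results and error output could be sketched as follows. Everything here is hypothetical: the `Match` type, both function names, the bracketed section labels, and the 50-result cap are assumptions made for illustration, not the SWE-agent implementation.

```python
from dataclasses import dataclass


@dataclass
class Match:
    path: str     # file the hit was found in
    line_no: int  # 1-based line number of the hit
    text: str     # the matching line


def format_search_results(query: str, matches: list[Match], limit: int = 50) -> str:
    """Show each hit as path:line: text, with counts so truncation is visible."""
    shown = matches[:limit]
    header = f'[Search: "{query}" | total matches: {len(matches)} | shown: {len(shown)}]'
    rows = [f"{m.path}:{m.line_no}: {m.text.strip()}" for m in shown]
    return "\n".join([header, *rows, "[End of search results]"])


def format_command_result(command: str, exit_code: int, stdout: str, stderr: str) -> str:
    """Keep stderr in its own labeled section so errors are not buried in stdout."""
    parts = [f"[Command: {command} | exit code {exit_code}]"]
    if stderr.strip():
        parts += ["[stderr]", stderr.strip()]
    parts += ["[stdout]", stdout.strip() or "(no output)"]
    parts.append("[End of command output]")
    return "\n".join(parts)
```

Putting the exit code in the header and the error stream ahead of standard output is one way to make failures hard to miss; the same schema is used for every command so the model sees a consistent shape.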
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:35:18.718175+00:00 - report_created - created