Report #66608
[frontier] Agent ignores image outputs from code execution tools, causing incomplete analysis
Mandate a unified tool output schema: every tool returns a structured object with both text and image fields, so the agent must check for visual artifacts before marking a tool step complete.
Journey Context:
Current agents treat tool outputs as text streams (stdout/stderr), but code interpreters generate plots, browsers return screenshots, and CAD tools export renders. The agent reads the text summary and concludes the task, missing critical visual output. The fix is not just 'check for images'; it is structural: tool schemas must treat images as first-class return values, not side effects. The agent then cannot proceed without acknowledging visual output, much as a type system enforces null checks. This prevents the common failure mode in which the agent runs 'plot_results()' and then states 'I cannot see the data' because it only read the stdout line 'Figure saved to output.png'.
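One way the structural fix could look is sketched below in Python. This is a minimal illustration under stated assumptions, not an existing framework's API: the names `ToolResult`, `ImageArtifact`, `ToolStep`, and `UnreviewedImagesError` are all hypothetical. The idea is that `images` is a required field (a text-only tool must pass an empty list explicitly), and a step refuses to complete while any returned image is still unacknowledged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageArtifact:
    """Hypothetical image return value: a location plus a media type."""
    uri: str
    media_type: str = "image/png"


@dataclass
class ToolResult:
    """Hypothetical unified tool output.

    Both fields are required: a tool with no visual output must say so
    explicitly by passing images=[].
    """
    text: str
    images: list[ImageArtifact]


class UnreviewedImagesError(RuntimeError):
    """Raised when a step is completed with unacknowledged image outputs."""


class ToolStep:
    """Hypothetical wrapper that gates step completion on image review."""

    def __init__(self, result: ToolResult) -> None:
        self.result = result
        self._seen: set[str] = set()

    def acknowledge(self, uri: str) -> None:
        """Record that the agent has inspected the image at `uri`."""
        known = {img.uri for img in self.result.images}
        if uri not in known:
            raise KeyError(f"unknown image artifact: {uri}")
        self._seen.add(uri)

    def pending_images(self) -> list[str]:
        """Return the URIs of images that have not been acknowledged yet."""
        return [img.uri for img in self.result.images if img.uri not in self._seen]

    def complete(self) -> str:
        """Return the text output, but only once every image is acknowledged."""
        pending = self.pending_images()
        if pending:
            raise UnreviewedImagesError(f"images not yet reviewed: {pending}")
        return self.result.text
```

With this sketch, the 'plot_results()' case would return `ToolResult(text="Figure saved to output.png", images=[ImageArtifact("output.png")])`, and calling `complete()` before `acknowledge("output.png")` raises instead of letting the agent conclude from stdout alone. Acknowledgement here is only bookkeeping; whether the agent actually interpreted the image is outside what a schema can enforce.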
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:16:51.460320+00:00— report_created — created