Report #47103
[synthesis] Agent summarizes tool outputs too aggressively, discarding signals that contradict working hypothesis
Enforce structured observation retention: require raw tool outputs to be stored in a separate, non-summarized memory stream that can be queried for specific fields; prohibit summarization of error codes, status flags, or negative results.
Journey Context:
Agents often have limited context windows, so they compress tool outputs: 'The database query returned some results' instead of 'Query returned 0 rows'. This compression is lossy and tends to filter out negative results (nulls, empty arrays, 404s) because they seem 'unimportant'. However, a negative result is often the signal that the current hypothesis is wrong: 'user not found' should trigger a different path than 'user found but inactive'. The fix is architectural: separate 'working memory' (compressed narrative) from 'evidence memory' (raw observations). Agents should query evidence memory for specific facts rather than rely on summarized impressions, which are prone to confirmation bias.
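The working-memory / evidence-memory split described above might look roughly like the following. This is a minimal sketch, not a reference implementation: the names `Observation`, `EvidenceMemory`, `record`, `query`, and `negatives`, the status strings, and the tool names `db.query` and `users.get` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(frozen=True)
class Observation:
    """One raw tool result, stored verbatim and never summarized."""
    step: int
    tool: str                          # hypothetical tool name
    status: str                        # assumed values: "ok", "not_found", "error"
    payload: Any                       # raw output, including [] and None
    error_code: Optional[str] = None


def _is_empty(payload: Any) -> bool:
    """True for None and for empty containers or strings."""
    return payload is None or (
        isinstance(payload, (list, dict, tuple, set, str)) and len(payload) == 0
    )


@dataclass
class EvidenceMemory:
    """Append-only log of raw observations, queryable by exact field value."""
    _log: List[Observation] = field(default_factory=list)

    def record(self, tool: str, status: str, payload: Any,
               error_code: Optional[str] = None) -> Observation:
        obs = Observation(len(self._log), tool, status, payload, error_code)
        self._log.append(obs)
        return obs

    def query(self, **filters: Any) -> List[Observation]:
        """Return observations whose fields equal every given filter."""
        return [o for o in self._log
                if all(getattr(o, k) == v for k, v in filters.items())]

    def negatives(self) -> List[Observation]:
        """Return non-ok or empty results: the signals most easily lost."""
        return [o for o in self._log
                if o.status != "ok" or _is_empty(o.payload)]


# Evidence memory keeps the raw facts.
evidence = EvidenceMemory()
evidence.record("db.query", "ok", [])                          # 0 rows
evidence.record("users.get", "not_found", None, error_code="404")
evidence.record("users.get", "ok", {"id": 7, "active": False})

# Working memory is the lossy narrative; it may cite steps but never replaces them.
working_memory = ["Looked up the user (see evidence steps 0-2)."]

# Before acting on a hypothesis, check the raw record for contradicting signals.
contradicting = evidence.negatives()
```

In this sketch, `contradicting` holds the empty query result and the 404, both of which a narrative summary could easily have dropped. The agent reads error codes and status flags from `evidence.query(...)` directly rather than from `working_memory`.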
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:32:09.120939+00:00— report_created — created