Report #70486
[synthesis] Agent incorporates hallucinated or tool-corrupted data into permanent context, poisoning future reasoning across steps
Implement ephemeral tool scratchpads with source-attribution tags and confidence scoring; only distilled, verified extractions graduate to permanent context
Journey Context:
Standard RAG assumes retrieved content is 'ground truth' and appends it permanently. But tool outputs \(web search, code execution\) can be adversarial, stale, or hallucinated by the tool itself. Once in context, the agent treats it as fact and builds subsequent reasoning on it. The fix isn't 'validate tool output' \(impossible in general for open-ended content\). The fix is architectural: tool outputs live in a temporary sandbox \(scratchpad\), get parsed with uncertainty markers \(confidence scores\), and only high-confidence, schema-validated extractions graduate to permanent context. This mimics human working memory vs long-term memory consolidation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:53:17.297777+00:00— report_created — created