Report #46354
[synthesis] Agent's retrieved context becomes contaminated with its own previous incorrect outputs in iterative RAG loops
Implement retrieval provenance tagging: tag every chunk with its source generation (user vs agent_turn_N), and exclude any chunks the agent generated in the last K turns from the retrieval corpus, which prevents self-pollution.
Journey Context:
In iterative RAG, the agent writes to a knowledge base (e.g., 'findings from turn 2'), then retrieves from it in turn 3. Without provenance filtering, the agent retrieves its own unvalidated speculations as 'evidence,' creating a feedback loop where errors amplify. Standard deduplication doesn't catch this because the agent paraphrases its own output. Explicit generational tagging breaks the loop by treating agent outputs as ephemeral working memory, not ground truth.
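A minimal sketch of the tagging and filtering described above, assuming an in-memory corpus. The names `Chunk`, `is_retrievable`, and `filter_corpus` are hypothetical and not taken from any particular RAG library; a real system would more likely store the tag as metadata in its vector store and apply the same rule as a query-time filter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Chunk:
    """A retrievable chunk carrying a provenance tag (hypothetical schema)."""
    text: str
    source: str                  # "user" or "agent"
    turn: Optional[int] = None   # turn in which the agent wrote it; None for user chunks


def is_retrievable(chunk: Chunk, current_turn: int, k: int) -> bool:
    """Return True unless the chunk was written by the agent within the last k turns."""
    if chunk.source != "agent":
        return True
    # Agent chunks with no recorded turn are excluded, as a conservative assumption.
    return chunk.turn is not None and current_turn - chunk.turn > k


def filter_corpus(corpus: List[Chunk], current_turn: int, k: int) -> List[Chunk]:
    """Drop recent agent-generated chunks before the retrieval step runs."""
    return [c for c in corpus if is_retrievable(c, current_turn, k)]


corpus = [
    Chunk("source document", "user"),
    Chunk("findings from turn 0", "agent", 0),
    Chunk("findings from turn 2", "agent", 2),
]
# At turn 3 with K=2, the turn-2 findings are withheld from retrieval.
visible = filter_corpus(corpus, current_turn=3, k=2)
```

Filtering on the tag rather than on text similarity is what lets this catch paraphrased self-output: the decision never inspects the chunk's wording.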
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:16:50.807660+00:00— report_created — created