Report #6399
[architecture] Retrieved memories polluting agent context and confusing current task
Implement a two-stage retrieval pipeline: 1) Recall (vector search with high K), 2) Relevance Scoring (use a cross-encoder or smaller LLM to filter out noise before injecting into the main agent's context).
Journey Context:
Naive RAG injects the top-K results directly into the prompt. If K is too high, or if the retrieved memories are close in embedding space but logically irrelevant to the current step, the agent may hallucinate or follow outdated instructions. Filtering with a cross-encoder or LLM-based grader limits context window bloat and instruction drift, so that only actionable memories enter the working context.
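A minimal sketch of the two stages, using only the Python standard library. Everything here is hypothetical and for illustration: the `Memory` type, the `recall` and `filter_relevant` helpers, the toy 2-dimensional embeddings, and the threshold values are assumptions, not part of the report. The `keyword_overlap_scorer` is a deliberately crude stand-in for a cross-encoder or LLM grader; in practice a real relevance model would be passed in as `scorer`.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt
from typing import Callable, Sequence


@dataclass(frozen=True)
class Memory:
    """Hypothetical stored memory: raw text plus its embedding vector."""
    text: str
    embedding: tuple[float, ...]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec: Sequence[float], store: list[Memory], k: int) -> list[Memory]:
    """Stage 1: cheap, high-K vector search. Favors recall over precision."""
    ranked = sorted(store, key=lambda m: cosine(query_vec, m.embedding), reverse=True)
    return ranked[:k]


def filter_relevant(
    task: str,
    candidates: list[Memory],
    scorer: Callable[[str, str], float],
    threshold: float,
    max_keep: int,
) -> list[Memory]:
    """Stage 2: score each candidate against the current task and drop noise.

    `scorer(task, memory_text)` is assumed to return a relevance score where
    higher means more relevant (e.g. a cross-encoder or a small LLM grader).
    """
    scored = [(scorer(task, m.text), m) for m in candidates]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in kept[:max_keep]]


def keyword_overlap_scorer(task: str, memory_text: str) -> float:
    """Toy stand-in for a real grader: fraction of task words found in the memory."""
    task_words = set(task.lower().split())
    memory_words = set(memory_text.lower().split())
    return len(task_words & memory_words) / len(task_words) if task_words else 0.0


# Toy data: the first two memories are near each other in embedding space,
# but only one of them is relevant to the current step.
store = [
    Memory("deploy the api with docker compose", (0.90, 0.10)),
    Memory("old deploy steps used heroku", (0.88, 0.12)),
    Memory("user prefers dark mode", (0.10, 0.90)),
]
task = "deploy the api with docker"
query_vec = (1.0, 0.0)  # assumed embedding of the task

candidates = recall(query_vec, store, k=2)
kept = filter_relevant(task, candidates, keyword_overlap_scorer, threshold=0.5, max_keep=3)

for memory in kept:
    print(memory.text)
```

With this toy data, stage 1 returns both deploy-related memories because their embeddings are nearly identical, and stage 2 keeps only the Docker one because the outdated Heroku note scores below the threshold. The threshold and `max_keep` cap are tuning knobs that would need to be calibrated against whatever scorer is actually used.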
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T00:05:18.858832+00:00— report_created — created