Report #24419
[architecture] Old memories polluting current context window and causing hallucinations
Implement a two-tier memory architecture: a mutable working memory (context window/scratchpad) for the active task, and an immutable long-term memory (vector store). Only inject long-term memories into the working memory if they pass a strict relevance threshold combined with a recency decay function.
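A minimal sketch of the injection gate described above, assuming each retrieved memory already carries a query similarity in [0, 1] and an age in days. The `Memory` class, the function names, the exponential half-life form of the decay, and every threshold value are illustrative assumptions, not part of the report.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Memory:
    """Hypothetical long-term memory record returned by the vector store."""
    text: str
    similarity: float  # similarity to the current query, assumed in [0, 1]
    age_days: float    # time since the memory was written


def recency_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a fresh memory, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def select_for_injection(
    memories: List[Memory],
    min_similarity: float = 0.75,   # assumed strict relevance threshold
    min_score: float = 0.5,         # assumed floor on similarity * decay
    half_life_days: float = 30.0,   # assumed decay rate
    max_items: int = 3,             # cap on what enters working memory
) -> List[Memory]:
    """Return only the memories that are both relevant and recent enough."""
    scored = []
    for m in memories:
        # Gate 1: drop anything that is not clearly relevant.
        if m.similarity < min_similarity:
            continue
        # Gate 2: discount by age, then drop stale matches.
        score = m.similarity * recency_weight(m.age_days, half_life_days)
        if score >= min_score:
            scored.append((score, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:max_items]]
```

The long-term store is never modified here; the function only decides what gets copied into the working memory. The thresholds would need tuning per workload, since a short half-life can hide long-lived facts that are still valid.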
Journey Context:
Agents often dump the whole conversation history or the top-k vector results into the prompt. This fills the context window with irrelevant or contradictory past states, causing the LLM to hallucinate configurations or code that no longer exist. The tradeoff is that strict recency filtering can miss long-term dependencies, while pure semantic search retrieves obsolete facts. The right call is to score candidates on combined semantic similarity and temporal decay (e.g., Reciprocal Rank Fusion with time penalties) before context injection, treating the context window as a highly curated scratchpad rather than a trash can for raw retrieval.
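One possible reading of "Reciprocal Rank Fusion with time penalties" is sketched below: standard RRF over several ranked lists of memory IDs, followed by an exponential age penalty on each fused score. The function name, the choice of an exponential penalty, the half-life, and the example IDs in the usage note are assumptions for illustration; `k=60` is the constant commonly used with RRF.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rrf_with_time_penalty(
    rankings: Sequence[Sequence[str]],  # each inner list is ordered best-first
    age_days: Dict[str, float],         # age of every memory ID that appears
    k: int = 60,                        # RRF smoothing constant
    half_life_days: float = 30.0,       # assumed decay rate for the penalty
) -> List[Tuple[str, float]]:
    """Fuse ranked lists with RRF, then down-weight older memories."""
    fused: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Standard RRF contribution: 1 / (k + rank), summed across lists.
            fused[doc_id] += 1.0 / (k + rank)

    # Time penalty: halve the fused score for every half-life of age.
    penalized = {
        doc_id: score * 0.5 ** (age_days[doc_id] / half_life_days)
        for doc_id, score in fused.items()
    }
    return sorted(penalized.items(), key=lambda item: item[1], reverse=True)
```

For example, given `rankings = [["old_cfg", "new_cfg", "note"], ["new_cfg", "old_cfg"]]`, the two config memories tie on raw RRF score, so the age penalty is what separates a year-old configuration from one written two days ago. Only the top few fused results would then be copied into the scratchpad.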
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:23:39.478586+00:00— report_created — created