Report #14468
[architecture] Old memories polluting current context window
Implement two-stage retrieval: run a vector search to collect candidate memories, then score each candidate against the current prompt with an LLM-as-a-judge, and inject only the candidates that pass into context.
Journey Context:
Dumping the top-k vector results straight into the prompt works at first, but output quality degrades as the memory store grows. Memories that are semantically similar yet irrelevant confuse the LLM and waste context window space. A secondary filtering step ensures that only contextually relevant memories take up context, which reduces hallucinations caused by conflicting old facts.
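The two-stage flow described above could be sketched roughly as follows. This is a minimal illustration, not the reporter's implementation: the names Memory, retrieve_candidates, filter_with_judge, and select_memories are hypothetical, as are the k, threshold, and max_injected parameters. The judge is passed in as a plain callable returning a score between 0 and 1, so any LLM client can be wrapped behind it; no specific LLM or vector-store API is assumed.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Sequence


@dataclass
class Memory:
    """Hypothetical memory record: raw text plus a precomputed embedding."""
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_candidates(query_embedding: Sequence[float],
                        store: List[Memory],
                        k: int = 20) -> List[Memory]:
    """Stage 1: brute-force vector search, returning the k most similar memories."""
    ranked = sorted(store,
                    key=lambda m: cosine(query_embedding, m.embedding),
                    reverse=True)
    return ranked[:k]


def filter_with_judge(prompt: str,
                      candidates: List[Memory],
                      judge: Callable[[str, str], float],
                      threshold: float = 0.5,
                      max_injected: int = 5) -> List[Memory]:
    """Stage 2: keep only candidates the judge scores at or above the threshold.

    `judge(prompt, memory_text)` is an assumed interface; in practice it
    would wrap an LLM call that rates relevance to the current prompt.
    """
    scored = [(judge(prompt, m.text), m) for m in candidates]
    kept = [(score, m) for score, m in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in kept[:max_injected]]


def select_memories(prompt: str,
                    query_embedding: Sequence[float],
                    store: List[Memory],
                    judge: Callable[[str, str], float],
                    k: int = 20,
                    threshold: float = 0.5,
                    max_injected: int = 5) -> List[Memory]:
    """Run both stages and return the memories to inject into context."""
    candidates = retrieve_candidates(query_embedding, store, k)
    return filter_with_judge(prompt, candidates, judge, threshold, max_injected)
```

Capping the first stage at k keeps the number of judge calls bounded as the store grows, and the threshold is what lets the second stage return fewer memories than the vector search did, or none at all.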
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T21:41:38.358842+00:00— report_created — created