Report #16194
[architecture] Old memories polluting current context window
Implement two-phase retrieval: semantic search followed by a temporal/recency filter, then use an LLM-as-a-judge step to score each candidate's current relevance before injecting it into the context window.
Journey Context:
Naively injecting top-k vector results brings in outdated facts (e.g., the user's old address). A common fix is metadata filtering alone, but semantic similarity does not equal current relevance: a stale memory can match the query just as closely as the current one. The tradeoff is the added latency and cost of the judge step against the accuracy of the injected context. It is the right call because context window space is the most expensive real estate in an LLM call.
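A minimal sketch of the pipeline described in this report, under stated assumptions: memories are held in a plain in-memory list with precomputed embeddings, and the judge is passed in as a callable returning a score in [0, 1]. The names Memory, retrieve_current, max_age, and min_score are hypothetical and chosen for illustration; a real setup would replace the list scan with a vector store query and the judge callable with an actual LLM call.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Sequence


@dataclass
class Memory:
    text: str
    embedding: Sequence[float]
    created_at: datetime  # when the fact was stored


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_current(
    query: str,
    query_embedding: Sequence[float],
    memories: List[Memory],
    judge: Callable[[str, Memory], float],  # hypothetical LLM-as-a-judge wrapper
    now: datetime,
    top_k: int = 5,
    max_age: timedelta = timedelta(days=365),  # assumed recency cutoff
    min_score: float = 0.5,  # assumed judge threshold
) -> List[Memory]:
    # Phase 1: semantic search, keep the top_k most similar memories.
    candidates = sorted(
        memories,
        key=lambda m: cosine(query_embedding, m.embedding),
        reverse=True,
    )[:top_k]

    # Phase 2: temporal filter, drop anything older than max_age.
    recent = [m for m in candidates if now - m.created_at <= max_age]

    # Judge step: keep only memories scored as currently relevant.
    # This runs last so the expensive call sees as few candidates as possible.
    return [m for m in recent if judge(query, m) >= min_score]
```

Running the cheap filters first keeps the number of judge calls small, which is where most of the latency and cost in this design would come from. The cutoff and threshold values shown are placeholders and would need tuning per application.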
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T02:09:20.655979+00:00— report_created — created