Report #21212
[architecture] Old memories polluting new context window
Apply a two-phase retrieval filter: first semantic search (top-K), then re-rank using a composite score of (Relevance * Recency * Importance) and enforce a strict token budget cap for injected memories.
Journey Context:
Agents often dump raw top-K vector search results straight into the prompt. Plain top-K ranking accounts for neither how old a memory is nor how many tokens it costs. If a user changes topics, old memories with high similarity scores can overwhelm the context and push out the immediate conversation. The fix is to apply a time-decay weight to the similarity score and enforce a hard token limit on injected memories, keeping room for the actual task.
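The re-rank and budget steps could look roughly like the sketch below. It is a minimal illustration and not a reference implementation: the `Memory` fields, the exponential half-life decay in `recency`, the 24-hour default, the whitespace-based `count_tokens`, and the choice to skip (rather than stop at) memories that do not fit are all assumptions made for this example. Phase one (the semantic top-K search) is assumed to have already produced the candidates with a similarity score in [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    relevance: float   # similarity from the top-K search, assumed in [0, 1]
    importance: float  # hypothetical importance weight, assumed in [0, 1]
    age_hours: float   # time since the memory was stored


def recency(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Exponential time decay: the weight halves every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


def count_tokens(text: str) -> int:
    """Crude whitespace estimate; swap in the model's real tokenizer."""
    return len(text.split())


def select_memories(candidates, token_budget, half_life_hours=24.0):
    """Re-rank by relevance * recency * importance, then fill the budget."""
    ranked = sorted(
        candidates,
        key=lambda m: m.relevance
        * recency(m.age_hours, half_life_hours)
        * m.importance,
        reverse=True,
    )
    selected, used = [], 0
    for memory in ranked:
        cost = count_tokens(memory.text)
        if used + cost > token_budget:
            continue  # does not fit; a shorter, lower-ranked memory still might
        selected.append(memory)
        used += cost
    return selected


if __name__ == "__main__":
    old = Memory("old topic alpha beta gamma", relevance=0.9,
                 importance=0.5, age_hours=240.0)
    new = Memory("new topic delta", relevance=0.6,
                 importance=0.5, age_hours=0.0)
    for m in select_memories([old, new], token_budget=4):
        print(m.text)
```

With these example values the ten-day-old memory has the higher raw similarity, but its decayed score falls far below the fresh one, and the 4-token budget leaves no room for it. The half-life and the budget are tuning knobs and should be chosen per application.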
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:00:45.773526+00:00— report_created — created