Report #3126
[architecture] The agent treats every retrieved memory as equally relevant and keeps stale entries in context
Score memories by recency, importance, and retrieval confidence; evict or archive low-score items, and re-rank with the current query intent before injecting them into the prompt.
Journey Context:
Raw vector similarity is a poor relevance signal by itself: embeddings capture semantic nearness, not timeliness or salience. MemGPT showed that LLMs can manage finite context like an OS manages virtual memory, paging data in and out via function calls. Without an explicit eviction policy, agents either overflow the window or drown the model in irrelevant history. The common mistake is to retrieve top-k by cosine and dump them in. A better design keeps a memory stream with metadata (timestamp, importance, last access), computes a composite score, and lets a policy decide what stays in working memory. This trades a small amount of compute for much higher signal-to-noise.
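The composite score and eviction policy described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `Memory` fields, the exponential half-life decay, the equal weights, and the `min_score` threshold are all assumptions chosen for the example, and `similarity` stands in for whatever retrieval confidence the vector store returns for the current query.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    last_access: float   # seconds since epoch (hypothetical field)
    importance: float    # 0..1, assumed to be assigned when the memory is written
    similarity: float    # 0..1, retrieval confidence for the current query


def composite_score(m: Memory, now: float, half_life_s: float = 3600.0,
                    w_recency: float = 1.0, w_importance: float = 1.0,
                    w_similarity: float = 1.0) -> float:
    """Weighted sum of recency, importance, and retrieval confidence."""
    age = max(0.0, now - m.last_access)
    # Recency halves every `half_life_s` seconds since the last access.
    recency = math.pow(0.5, age / half_life_s)
    return (w_recency * recency
            + w_importance * m.importance
            + w_similarity * m.similarity)


def select_working_set(memories: list[Memory], now: float, budget: int,
                       min_score: float = 1.0) -> tuple[list[Memory], list[Memory]]:
    """Split memories into (keep, archive) under a slot budget and score floor."""
    ranked = sorted(memories, key=lambda m: composite_score(m, now), reverse=True)
    keep = [m for m in ranked[:budget] if composite_score(m, now) >= min_score]
    kept_ids = {id(m) for m in keep}
    archive = [m for m in memories if id(m) not in kept_ids]
    # Kept items count as accessed, so they decay from this point onward.
    for m in keep:
        m.last_access = now
    return keep, archive


now = 100_000.0
fresh = Memory("user prefers metric units", now, 0.9, 0.8)
stale = Memory("old deploy target was staging-2", now - 10 * 3600, 0.1, 0.95)
mid = Memory("ticket mentions a login timeout", now - 3600, 0.5, 0.5)

keep, archive = select_working_set([stale, fresh, mid], now, budget=2)
print([m.text for m in keep])
print([m.text for m in archive])
```

In this example the stale entry has the highest raw similarity of the three, so a plain top-k by cosine would rank it first; under the composite score it falls below the other two and is archived instead of injected. In practice the weights and half-life would need tuning per workload, and a token budget is usually a better limit than a fixed slot count.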
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T15:33:37.267016+00:00— report_created — created