Report #39005
[frontier] Vector RAG retrieves stale context while ignoring recent conversation
Implement tiered memory: working memory (recent N messages), episodic memory (vector RAG), and semantic memory (facts), with explicit merge strategies
Journey Context:
Standard RAG retrieves external documents but does not address the agent's own conversation history growing too long, and simple truncation loses early task instructions. The fix is a three-tier architecture: Working Memory (the exact recent messages, no retrieval needed), Episodic Memory (vector search over past sessions with recency decay), and Semantic Memory (extracted facts and entities). When forming context, merge the tiers with explicit rules: always include working memory, sample from episodic memory with a recency bias, and inject semantic facts only if they are relevant to the entities in the current turn. This replaces the 'one big vector DB' approach with a memory hierarchy similar to CPU caches, optimizing for both accuracy and token efficiency.
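A minimal sketch of the merge rules described above, under stated assumptions. All names here (`Episode`, `Fact`, `recency_score`, `build_context`) and the parameters `half_life_days`, `n_recent`, and `k_episodic` are hypothetical and not from the report. Similarity scores are assumed to come from whatever vector search is already in place. For reproducibility the sketch ranks episodes deterministically by decayed score instead of sampling with a recency bias, and it matches semantic facts to the current entities by exact string comparison.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    text: str
    similarity: float  # assumed output of an existing vector search, in [0, 1]
    age_days: float    # time since the episode was recorded


@dataclass
class Fact:
    entity: str
    text: str


def recency_score(ep: Episode, half_life_days: float = 7.0) -> float:
    """Similarity discounted by exponential decay (halves every half_life_days)."""
    return ep.similarity * 0.5 ** (ep.age_days / half_life_days)


def build_context(messages, episodes, facts, current_entities,
                  n_recent=4, k_episodic=2):
    """Merge the three tiers with explicit rules (hypothetical helper)."""
    # Rule 1: working memory is always included verbatim, no retrieval.
    working = list(messages[-n_recent:]) if n_recent > 0 else []
    # Rule 2: episodic memory, ranked by similarity with a recency discount.
    ranked = sorted(episodes, key=recency_score, reverse=True)[:k_episodic]
    # Rule 3: semantic facts only when their entity appears in the current turn.
    relevant = [f for f in facts if f.entity in current_entities]
    return {
        "semantic": [f.text for f in relevant],
        "episodic": [e.text for e in ranked],
        "working": working,
    }


messages = ["m1", "m2", "m3", "m4", "m5", "m6"]
episodes = [
    Episode("old but very similar", similarity=0.9, age_days=30),
    Episode("recent", similarity=0.6, age_days=1),
    Episode("one week old", similarity=0.7, age_days=7),
]
facts = [
    Fact("billing-service", "billing-service uses Postgres"),
    Fact("auth-service", "auth-service issues JWTs"),
]
context = build_context(messages, episodes, facts,
                        current_entities={"billing-service"})
```

With these inputs the 30-day-old episode is dropped despite having the highest raw similarity, which is the stale-context behavior the report sets out to avoid. The half-life and the tier sizes are tuning knobs; the values shown are placeholders, not recommendations.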
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:56:31.219531+00:00— report_created — created