Report #40373
[frontier] Naive RAG retrieves irrelevant historical noise instead of relevant user preferences
Implement tiered memory: working (recent), episodic (summarized facts), semantic (embeddings), with LLM compression for long-term storage
Journey Context:
Simple vector RAG fails for long-term personalization because it retrieves literal old messages instead of extracted facts (e.g., 'here is a chat from 3 months ago' rather than 'user likes Python'). Mem0 (2024) and similar systems use a hierarchy: working memory for the current turn, episodic memory for LLM-summarized facts with importance scoring, and semantic memory for embeddings. The key innovation is using an LLM to compress and tag memories during off-peak hours, which produces a structured knowledge graph rather than a raw log.
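The three tiers described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Mem0's API: the names TieredMemory, Fact, toy_summarize and toy_embed are all hypothetical. The rule-based summarizer stands in for an LLM call, and the bag-of-words vector stands in for a real embedding model. The knowledge-graph step is left out; facts are kept in a flat list with tags.

```python
import math
import re
from collections import Counter, deque
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Fact:
    """An extracted fact (episodic tier), not a raw message."""
    text: str
    importance: float  # 0..1, assigned by the summarizer
    tags: List[str] = field(default_factory=list)


def toy_summarize(messages: List[str]) -> List[Fact]:
    """Stand-in for an LLM: turns 'I like X' / 'I prefer X' into a fact."""
    facts = []
    for msg in messages:
        low = msg.lower()
        for marker in ("i like ", "i prefer "):
            if marker in low:
                obj = low.split(marker, 1)[1].strip(" .!")
                facts.append(Fact(f"user likes {obj}", 0.9, ["preference"]))
                break
    return facts


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class TieredMemory:
    def __init__(self, summarize: Callable, embed: Callable, working_size: int = 4):
        self.summarize = summarize
        self.embed = embed
        self.working = deque(maxlen=working_size)  # recent raw turns
        self.backlog: List[str] = []               # raw turns awaiting compression
        self.episodic: List[Fact] = []             # summarized facts
        self.semantic: list = []                   # (vector, Fact) pairs

    def add_message(self, msg: str) -> None:
        self.working.append(msg)
        self.backlog.append(msg)

    def compress(self) -> int:
        """Offline pass (e.g. off-peak): summarize the backlog, then index it."""
        facts = self.summarize(self.backlog)
        for fact in facts:
            self.episodic.append(fact)
            self.semantic.append((self.embed(fact.text), fact))
        self.backlog.clear()  # the raw log is dropped; only facts persist
        return len(facts)

    def retrieve(self, query: str, k: int = 3) -> List[Fact]:
        """Rank facts by similarity weighted by importance."""
        qv = self.embed(query)
        ranked = sorted(
            self.semantic,
            key=lambda pair: cosine(qv, pair[0]) * pair[1].importance,
            reverse=True,
        )
        return [fact for _, fact in ranked[:k]]
```

In this sketch, retrieval only ever searches extracted facts, so a query about the user's language preference returns a fact such as 'user likes python' rather than the raw chat turn it came from. Weighting similarity by importance is one possible scoring choice; the source does not specify how the two are combined.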
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:14:07.155754+00:00— report_created — created