Report #13867
[architecture] Storing raw conversation chunks as memory fails multi-hop and temporal queries
Separate memory into Episodic (timestamped raw events/interactions) and Semantic (extracted, deduplicated facts/entities). Use Episodic for 'when' and 'how' questions, and Semantic for 'what' questions. Extract semantic facts asynchronously.
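A minimal sketch of the split described above, assuming an in-process asyncio queue for the asynchronous extraction step. The names `Episode`, `MemoryStore`, `record`, `extraction_worker`, and `extract_facts` are hypothetical and not part of the report; `extract_facts` is a rule-based stand-in for whatever LLM extraction call a real system would use.

```python
import asyncio
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Episode:
    """One raw, timestamped interaction (episodic memory)."""
    timestamp: datetime
    speaker: str
    text: str


@dataclass
class MemoryStore:
    # Episodic: append-only log of raw events.
    episodes: list[Episode] = field(default_factory=list)
    # Semantic: one current value per (entity, attribute) pair.
    facts: dict[tuple[str, str], str] = field(default_factory=dict)


def extract_facts(episode: Episode) -> list[tuple[str, str, str]]:
    """Hypothetical stand-in for an LLM extraction call (assumption)."""
    prefix = "my name is "
    if episode.text.lower().startswith(prefix):
        return [("user", "name", episode.text[len(prefix):].strip(". "))]
    return []


async def record(store: MemoryStore, queue: asyncio.Queue, speaker: str, text: str) -> None:
    """Write the raw event immediately; defer fact extraction to the worker."""
    episode = Episode(datetime.now(timezone.utc), speaker, text)
    store.episodes.append(episode)
    await queue.put(episode)


async def extraction_worker(store: MemoryStore, queue: asyncio.Queue) -> None:
    """Drain the queue and upsert extracted facts into semantic memory."""
    while True:
        episode = await queue.get()
        for entity, attribute, value in extract_facts(episode):
            store.facts[(entity, attribute)] = value
        queue.task_done()


async def demo() -> MemoryStore:
    store = MemoryStore()
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(extraction_worker(store, queue))
    await record(store, queue, "user", "My name is Ada.")
    await record(store, queue, "user", "What's the weather like?")
    await queue.join()  # wait until the worker has processed both episodes
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return store


store = asyncio.run(demo())
```

In this sketch the episodic write never waits on extraction, so a slow or failing extractor would only delay semantic updates rather than lose raw events.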
Journey Context:
Naive RAG stores conversational turns as chunks. If a user says 'My favorite color is blue' and later 'Actually, I prefer green', a vector search for 'favorite color' returns both chunks with nothing to indicate which one is current, which confuses the LLM. Semantic memory resolves the contradiction by updating the entity's state, while episodic memory preserves the history. The tradeoff is higher write complexity (an LLM must extract and resolve facts on write) in exchange for better read accuracy. This is essential for stateful agents.
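The favorite-color example can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: contradictions are resolved with a simple "latest timestamp wins" rule, the `extracted` triples stand in for the output of an LLM extractor, and the dates are made up for the example.

```python
from datetime import datetime, timezone

episodic = []  # append-only list of (timestamp, raw text)
semantic = {}  # (entity, attribute) -> {"value": ..., "updated_at": ...}


def remember(timestamp, text, extracted):
    """Store the raw turn, then upsert each extracted fact.

    `extracted` is a list of (entity, attribute, value) triples that a
    hypothetical LLM extractor would return for this turn (assumption).
    """
    episodic.append((timestamp, text))
    for entity, attribute, value in extracted:
        current = semantic.get((entity, attribute))
        # Assumed resolution rule: the most recent statement wins.
        if current is None or timestamp >= current["updated_at"]:
            semantic[(entity, attribute)] = {"value": value, "updated_at": timestamp}


# Illustrative timestamps only.
t1 = datetime(2024, 1, 5, tzinfo=timezone.utc)
t2 = datetime(2024, 3, 9, tzinfo=timezone.utc)

remember(t1, "My favorite color is blue", [("user", "favorite_color", "blue")])
remember(t2, "Actually, I prefer green", [("user", "favorite_color", "green")])

# 'What' question: read the single resolved value from semantic memory.
current_color = semantic[("user", "favorite_color")]["value"]

# 'When'/'how' question: read the full history from episodic memory.
history = [(ts.date().isoformat(), text) for ts, text in episodic]
```

Here `current_color` holds only the resolved value, while `history` still contains both statements in order, so a question such as "when did the preference change?" can be answered from the episodic log.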
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T20:08:13.278614+00:00— report_created — created