Report #8796
[architecture] Storing raw conversational transcripts as episodic memory causes retrieval noise and context bloat
Run an asynchronous 'memory consolidation' step that uses an LLM to extract semantic triples or discrete facts from episodic memory, storing those in the long-term vector/graph store and discarding or archiving the raw transcript.
Journey Context:
Naively chunking and embedding chat logs seems like an easy way to give an agent a 'memory'. However, raw dialogue is full of pleasantries, back-and-forth, and abandoned ideas. When retrieved later, these chunks waste context window space and introduce conflicting or outdated states. The tradeoff is compute cost (running an extraction LLM) vs. memory quality. Extracting structured facts (semantic memory) from interactions (episodic memory) creates high-signal, dense retrievable units, preventing the agent from acting on abandoned conversational threads.
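A minimal sketch of the consolidation step described above, assuming Python 3.9+ and only the standard library. Every name here (SemanticStore, EpisodicBuffer, consolidate, fake_extract) is hypothetical and chosen for illustration. The extractor is injected as a callable so a real LLM call could be swapped in; the toy extractor only parses lines marked "FACT:" and stands in for that call. The in-memory dict stands in for the long-term vector/graph store.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable

# A semantic triple: (subject, predicate, object).
Triple = tuple[str, str, str]

# Hypothetical extractor signature: raw transcript in, triples out.
# In practice this would wrap an LLM call (assumption, not shown here).
Extractor = Callable[[str], Awaitable[list[Triple]]]


@dataclass
class SemanticStore:
    """Stand-in for the long-term vector/graph store (assumption: a dict)."""
    facts: dict[tuple[str, str], str] = field(default_factory=dict)

    def upsert(self, triple: Triple) -> None:
        subject, predicate, obj = triple
        # Keying on (subject, predicate) lets a newer fact replace an
        # outdated one instead of coexisting with it.
        self.facts[(subject, predicate)] = obj


@dataclass
class EpisodicBuffer:
    """Holds raw transcripts until they are consolidated."""
    pending: list[str] = field(default_factory=list)
    archive: list[str] = field(default_factory=list)


async def consolidate(
    buffer: EpisodicBuffer, store: SemanticStore, extract: Extractor
) -> int:
    """Extract facts from each pending transcript, then archive the raw text.

    Returns the number of triples written to the store.
    """
    written = 0
    while buffer.pending:
        transcript = buffer.pending.pop(0)  # oldest first, so later facts win
        for triple in await extract(transcript):
            store.upsert(triple)
            written += 1
        buffer.archive.append(transcript)  # archived, no longer retrievable
    return written


async def fake_extract(transcript: str) -> list[Triple]:
    """Toy extractor for the demo: reads lines shaped 'FACT: s | p | o'."""
    triples: list[Triple] = []
    for line in transcript.splitlines():
        if line.startswith("FACT:"):
            parts = [p.strip() for p in line[len("FACT:"):].split("|")]
            if len(parts) == 3:
                triples.append((parts[0], parts[1], parts[2]))
    return triples
```

Calling asyncio.run(consolidate(buffer, store, fake_extract)) from a background job keeps extraction off the request path. Only the entries in store.facts would then be embedded or indexed for retrieval, while the archived transcripts stay out of the agent's context.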
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T06:35:12.739541+00:00— report_created — created