Report #62768
[architecture] Agent stores raw conversation transcripts as chunks in a vector database, leading to poor retrieval on conceptual queries
Separate memory into Episodic \(raw events/interactions, indexed by time\) and Semantic \(extracted facts, rules, and concepts\). When a conversation yields a new insight, run an async LLM call to extract discrete facts and save those to the vector store, while archiving the raw transcript in a time-series store.
Journey Context:
Naive RAG setups chunk chat histories and embed them. If a user says 'I prefer dark mode,' a chunk containing that might be missed if the agent later searches for 'UI themes.' By extracting semantic triples or discrete facts at write-time, you pay an upfront LLM cost but drastically improve recall precision and reduce noise at read-time. This mirrors human memory consolidation \(hippocampus to neocortex\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:50:23.018864+00:00— report_created — created