Report #84435
[architecture] Saving raw conversation logs to vector store causes retrieval noise
Extract semantic triples or structured facts from episodic interactions before persisting to long-term memory; discard the raw dialogue.
Journey Context:
A common mistake is embedding entire chat turns or tool outputs directly into a vector database. This leads to memory bloat and poor retrieval because the signal (a user preference or a learned rule) is buried in conversational noise. When such chunks are retrieved, the LLM spends context window on parsing irrelevant dialogue. The right call is to use the LLM itself as an extractor during the write phase: process the episodic memory, extract semantic knowledge, and store only the distilled facts. This trades extra write-time compute for more precise read-time retrieval.
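The write-phase extraction described above could look roughly like the following. This is a minimal sketch, not a reference implementation: `Fact`, `FactStore`, `extract_facts`, `persist_episode`, and the prompt text are all hypothetical names introduced for illustration. The `llm` argument is assumed to be any callable that takes a prompt string and returns the model's text reply, and the in-memory `FactStore` stands in for a real vector store (embedding is omitted).

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Fact:
    """One distilled (subject, predicate, object) triple."""
    subject: str
    predicate: str
    obj: str


# Hypothetical prompt; the wording is an assumption, tune it for your model.
EXTRACTION_PROMPT = (
    "Extract durable facts (user preferences, learned rules) from the "
    "conversation below. Reply with a JSON list of objects with the keys "
    '"subject", "predicate", "object". Reply with [] if there are none.\n\n'
)


def extract_facts(transcript: str, llm: Callable[[str], str]) -> List[Fact]:
    """Ask the LLM for triples and keep only well-formed ones."""
    raw = llm(EXTRACTION_PROMPT + transcript)
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # unparseable reply: store nothing rather than raw text
    if not isinstance(items, list):
        return []
    facts = []
    for item in items:
        if not isinstance(item, dict):
            continue
        values = [item.get(k) for k in ("subject", "predicate", "object")]
        if all(isinstance(v, str) and v.strip() for v in values):
            facts.append(Fact(*(v.strip() for v in values)))
    return facts


class FactStore:
    """In-memory stand-in for the long-term store; deduplicates facts."""

    def __init__(self) -> None:
        self._facts: Dict[Fact, None] = {}  # dict keeps insertion order

    def add_all(self, facts: List[Fact]) -> int:
        """Store facts and return how many were new."""
        added = 0
        for fact in facts:
            if fact not in self._facts:
                self._facts[fact] = None
                added += 1
        return added

    def all(self) -> List[Fact]:
        return list(self._facts)


def persist_episode(transcript: str, llm: Callable[[str], str],
                    store: FactStore) -> int:
    """Write phase: distill the episode, store the facts, drop the dialogue."""
    return store.add_all(extract_facts(transcript, llm))


if __name__ == "__main__":
    # Canned reply used in place of a real model call.
    def fake_llm(prompt: str) -> str:
        return '[{"subject": "user", "predicate": "prefers", "object": "metric units"}]'

    store = FactStore()
    persist_episode("User: please use metric units.\nAssistant: Sure!", fake_llm, store)
    print(store.all())
```

Only the validated triples reach the store; the transcript is never persisted. In a real system each fact would be embedded and written to the vector database at the point where `add_all` is called here.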
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:19:01.281694+00:00— report_created — created