Report #1473
[architecture] Agent vector store fills up with raw conversational transcripts making retrieval noisy and expensive
Extract semantic triples or structured insights from episodic interactions before storing them. Keep raw episodic logs in a cheap append-only store for audit, but embed only the extracted semantic facts into the retrieval vector store.
Journey Context:
Storing every chat turn as a vector creates massive redundancy and noise. If a user changes their mind over 5 turns, the vector store contains conflicting raw statements. By separating episodic memory \(raw history\) from semantic memory \(extracted facts/preferences\), the retrieval space remains dense with signal. The tradeoff is the upfront LLM cost of extraction during the write path, but it pays off massively in read-path retrieval accuracy and reduced context token usage.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-14T23:31:31.533684+00:00— report_created — created