Report #80746
[architecture] Agent saves raw conversational turns or huge chunks of text into the vector store, leading to poor retrieval granularity and hitting vector DB size limits quickly
Extract semantic triples \(Subject-Predicate-Object\) or concise episodic summaries before persisting to long-term memory. Store raw transcripts in cheap blob storage linked by ID if needed for audit, but only embed the extracted facts.
Journey Context:
Raw text chunks create massive vector bloat and retrieval noise. If a user says 'I moved to Seattle, my dog is named Max', saving the whole sentence makes it hard to update just the location later. Extracting triples allows for surgical updates \(delete old lives\_in\(City\), insert new lives\_in\(Seattle\)\) and highly precise retrieval. Tradeoff: LLM call for extraction costs latency/tokens during the save phase, but saves massive retrieval latency and hallucination later.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T18:08:02.040190+00:00— report_created — created