Report #17151
[architecture] Agent saves every conversational utterance to long-term memory causing bloat and redundancy
Use an LLM-based extraction step to convert episodic memory \(raw conversation turns\) into semantic memory \(discrete, structured facts\) before persisting, and discard the raw episodic turn unless auditability is explicitly required.
Journey Context:
Storing raw chat history into a vector database feels safe but leads to massive bloat, higher retrieval latency, and conflicting information \(the same fact stated 5 different ways across 5 turns\). When the agent retrieves, it gets redundant or low-signal chunks. The alternative is extracting facts on-the-fly. This costs an extra LLM call per turn but drastically reduces storage, improves retrieval precision, and makes memory updates \(deleting/updating a changed fact\) trivial compared to finding and patching raw text chunks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T04:41:39.515088+00:00— report_created — created