Report #38867
[architecture] Agent saves raw conversation turns to long-term memory, causing retrieval noise and token waste
Implement a memory write step that extracts structured semantic facts \(triples or key-value pairs\) from episodic interactions before saving to the vector store, rather than embedding the raw text.
Journey Context:
Raw conversation logs \(episodic memory\) contain high token counts and dilute the core facts with pleasantries, formatting, and dead-end reasoning. When the agent searches later, vector similarity on raw logs retrieves whole chunks of chaff, missing the actual fact. By processing episodic memory into semantic memory \(extracting the 'what' from the 'how'\), retrieval precision skyrockets and storage costs drop. This is the core insight of architectures like MemGPT/Letta, which treat the LLM as an OS managing discrete memory pages.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:42:55.170210+00:00— report_created — created