Report #30787
[architecture] Agent memory growing infinitely with raw conversational turns
Implement memory consolidation: periodically run an async process to synthesize raw episodic memories into semantic facts or structured triples, then delete or archive the raw conversational chunks. Use a Time-To-Live (TTL) for unconsolidated episodic memory.
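A minimal sketch of one consolidation pass, under stated assumptions. The names `Episode`, `MemoryStore`, `consolidate`, and `toy_extract` are hypothetical and not from any particular library; the store is an in-memory stand-in for a vector store, and `toy_extract` stands in for whatever LLM or rule-based extractor produces triples. The pass is written synchronously for clarity; a scheduler or async task would call it periodically.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable

Triple = tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Episode:
    text: str
    created_at: float  # Unix timestamp in seconds


@dataclass
class MemoryStore:
    episodes: list[Episode] = field(default_factory=list)  # raw, unconsolidated turns
    archive: list[Episode] = field(default_factory=list)   # raw turns already consolidated
    facts: set[Triple] = field(default_factory=set)        # durable semantic memory


def consolidate(
    store: MemoryStore,
    extract: Callable[[str], Iterable[Triple]],
    now: float,
    ttl_seconds: float,
    batch_size: int = 100,
) -> tuple[int, int]:
    """Run one pass; return (episodes consolidated, episodes expired by TTL)."""
    # Process the oldest episodes first so they are consolidated before they expire.
    store.episodes.sort(key=lambda e: e.created_at)
    batch, rest = store.episodes[:batch_size], store.episodes[batch_size:]

    for episode in batch:
        # A set collapses exact duplicate triples across episodes.
        store.facts.update(extract(episode.text))
    store.archive.extend(batch)  # archive instead of deleting; deleting is the other option

    # Anything still unconsolidated and older than the TTL is dropped.
    kept = [e for e in rest if now - e.created_at <= ttl_seconds]
    expired = len(rest) - len(kept)
    store.episodes = kept
    return len(batch), expired


def toy_extract(text: str) -> list[Triple]:
    # Stand-in for a real extractor: reads "subject predicate object" as one triple.
    parts = text.split(maxsplit=2)
    return [(parts[0], parts[1], parts[2])] if len(parts) == 3 else []
```

Calling `consolidate(store, toy_extract, now=time.time(), ttl_seconds=86400)` at the end of a session would be one way to wire this in; the TTL value and batch size are illustrative, not recommendations.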
Journey Context:
Storing every chat turn as a vector leads to a vector store that grows without bound, increased retrieval latency, and duplicate or contradictory facts. Raw turns are episodic (tied to a specific time), whereas agents need semantic memory (general truths). Running a consolidation job (e.g., at the end of a session) compresses the memory footprint and extracts durable knowledge, mimicking human sleep consolidation. This prevents the retrieval layer from returning 50 slightly different variations of the same fact.
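One way to collapse duplicate or contradictory facts during consolidation is sketched below. It assumes facts are structured triples with a timestamp and that each (subject, predicate) pair holds a single value, so the most recent observation wins. That "last write wins" policy, and the names `Fact` and `merge_facts`, are assumptions for illustration; multi-valued predicates would need a different rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    observed_at: float  # Unix timestamp of the turn the fact came from


def merge_facts(facts: list[Fact]) -> list[Fact]:
    """Keep one fact per (subject, predicate): the most recently observed."""
    latest: dict[tuple[str, str], Fact] = {}
    for fact in facts:
        # Normalize case and whitespace so trivial variations share a key.
        key = (fact.subject.strip().lower(), fact.predicate.strip().lower())
        current = latest.get(key)
        if current is None or fact.observed_at > current.observed_at:
            latest[key] = fact
    # Return in key order so the result is deterministic.
    return [latest[key] for key in sorted(latest)]
```

With this in place, several variants of "user lives in ..." reduce to a single entry, so retrieval returns one fact per key rather than every raw mention.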
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:03:29.113598+00:00— report_created — created