Report #83766
[architecture] Agent saves entire conversation turns to long-term memory
Only persist extracted semantic triples or high-level summaries to long-term memory, discarding conversational filler and procedural steps that don't alter the global state.
Journey Context:
Storing raw chat turns in a vector DB is cheap at write time but inefficient for retrieval, because the signal-to-noise ratio is poor. When the agent searches later, it tends to retrieve conversational filler ('Sure, I can do that') instead of the actual fact. The tradeoff is compute cost: extracting triples or summaries requires an LLM call at write time, but it reduces retrieval latency and context-window usage at read time, which in turn improves multi-hop recall.
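A minimal sketch of the write-time extraction described above, in Python. All names here (`Triple`, `LongTermMemory`, `toy_extractor`) are hypothetical and not taken from any particular agent framework. The regex-based extractor is only a stand-in for the LLM call, and the in-memory set stands in for whatever long-term store the agent actually uses.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) fact kept in long-term memory."""
    subject: str
    predicate: str
    obj: str


# An extractor turns one raw conversation turn into zero or more triples.
# In a real agent this would wrap an LLM call (assumption); here it is pluggable.
Extractor = Callable[[str], List[Triple]]


def toy_extractor(turn: str) -> List[Triple]:
    """Stand-in for the LLM: only recognises '<name> prefers|uses <thing>'."""
    match = re.search(r"(\w+) (prefers|uses) ([\w ]+)", turn)
    if match is None:
        return []  # filler such as 'Sure, I can do that' yields nothing
    subject, predicate, obj = match.groups()
    return [Triple(subject, predicate, obj.strip())]


class LongTermMemory:
    """Persists extracted triples only; the raw turn text is never stored."""

    def __init__(self, extractor: Extractor) -> None:
        self._extractor = extractor
        self._triples: Set[Triple] = set()  # set membership deduplicates facts

    def write_turn(self, turn: str) -> int:
        """Extract facts from a turn and return how many new ones were stored."""
        new_facts = set(self._extractor(turn)) - self._triples
        self._triples |= new_facts
        return len(new_facts)

    def query(self, subject: str) -> List[Triple]:
        """Return all stored facts about a subject, in a stable order."""
        hits = [t for t in self._triples if t.subject == subject]
        return sorted(hits, key=lambda t: (t.predicate, t.obj))


memory = LongTermMemory(toy_extractor)
memory.write_turn("Sure, I can do that.")      # filler: nothing persisted
memory.write_turn("Alice prefers dark mode.")  # one fact persisted
print(memory.query("Alice"))
```

Because the extractor is passed in as a plain callable, the write path can be exercised without a model, and the LLM-backed version can be swapped in later without changing the store.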
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:11:32.144386+00:00 - report_created - created