Report #12276
[architecture] Storing entire conversation turns in long-term memory causes bloat and retrieval failures
Extract structured, discrete facts (triples or atomic insights) from conversation turns *before* writing to long-term memory. Store the raw turn in a cheap archive, but only index the extracted facts.
Journey Context:
Naively embedding and storing the user's raw chat history seems like an easy way to give an agent memory. However, conversational turns are full of pleasantries, back-and-forth, and ideas that were never acted on. When retrieved, they waste context tokens and rarely match the semantic intent of future queries. The tradeoff is compute cost at write time (extraction) versus precision at read time. Because each memory is written once and read many times, the one-time extraction cost is amortized across every later retrieval, which improves both retrieval accuracy and context efficiency.
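The write path described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Fact`, `Memory`, `extract_facts`, and `PATTERNS` are hypothetical names, the regex extractor is a stand-in for whatever extraction step (most likely an LLM call) a real system would use, and keyword overlap is a stand-in for embedding similarity.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    turn_id: int  # pointer back to the archived raw turn


# Hypothetical rule-based extractor (assumption: a real system would
# probably call an LLM here and parse structured output instead).
PATTERNS = [
    (re.compile(r"\bmy (\w+) is (\w+)", re.I),
     lambda m: ("user", m.group(1).lower(), m.group(2))),
    (re.compile(r"\bI (prefer|use|need) ([\w\- ]+?)(?:[.,!?]|$)", re.I),
     lambda m: ("user", m.group(1).lower(), m.group(2).strip())),
]


def extract_facts(turn_id, text):
    """Turn one raw conversation turn into zero or more atomic facts."""
    facts = []
    for pattern, build in PATTERNS:
        for match in pattern.finditer(text):
            subject, predicate, obj = build(match)
            facts.append(Fact(subject, predicate, obj, turn_id))
    return facts


@dataclass
class Memory:
    archive: dict = field(default_factory=dict)  # raw turns: stored, never searched
    index: list = field(default_factory=list)    # extracted facts: the only searchable data

    def write(self, text):
        """Archive the raw turn, then index only the facts extracted from it."""
        turn_id = len(self.archive)
        self.archive[turn_id] = text
        self.index.extend(extract_facts(turn_id, text))
        return turn_id

    def search(self, query):
        """Rank facts by keyword overlap (stand-in for embedding similarity)."""
        terms = set(re.findall(r"\w+", query.lower()))
        scored = []
        for fact in self.index:
            words = set(re.findall(
                r"\w+", f"{fact.subject} {fact.predicate} {fact.obj}".lower()))
            overlap = len(terms & words)
            if overlap:
                scored.append((overlap, fact))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored]


memory = Memory()
memory.write("Hi there! Thanks for the help yesterday.")
memory.write("By the way, my timezone is UTC, and I prefer dark mode.")
top_hit = memory.search("what timezone is the user in")[0]
print(top_hit, "| source turn:", memory.archive[top_hit.turn_id])
```

In this sketch the pleasantry-only turn is archived but contributes nothing to the index, so it can never be retrieved into context. Each fact keeps a `turn_id`, so the raw turn can still be pulled from the archive when the surrounding wording matters.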
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T15:38:55.038385+00:00— report_created — created