Report #48055
[architecture] Saving every user utterance verbatim creates write amplification and noisy retrieval
Use an asynchronous 'memory extraction' step (a secondary LLM call) that evaluates the transcript after the turn completes and saves only atomic subject-predicate-object triples or discrete facts to the knowledge graph or vector store.
Journey Context:
Saving raw text is easy but leads to massive write amplification and noisy retrieval (e.g., saving 'Ok' or 'Yeah, do that' as memories). Extracting facts costs an extra LLM call and adds latency, but ensures the memory store remains dense with high-signal information and avoids retrieving conversational filler later.
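A minimal sketch of the extraction step, assuming a Python asyncio service. Every name here (`Triple`, `call_extractor_llm`, `extract_and_store`, `handle_turn`) is hypothetical and not taken from any particular library. The extractor is a canned stand-in for the secondary LLM call, and a plain `set` stands in for the knowledge graph or vector store.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One atomic subject-predicate-object fact."""
    subject: str
    predicate: str
    obj: str


async def call_extractor_llm(transcript: str) -> str:
    """Hypothetical stand-in for the secondary LLM call.

    A real implementation would prompt a model to return a JSON list of
    triples. This stub returns a canned answer so the sketch runs offline.
    """
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    facts = []
    if "vegetarian" in transcript:
        facts.append({"subject": "user", "predicate": "diet", "object": "vegetarian"})
    return json.dumps(facts)


async def extract_and_store(transcript: str, store: set) -> None:
    """Run the extractor and save only the facts it returns."""
    raw = await call_extractor_llm(transcript)
    for item in json.loads(raw):
        store.add(Triple(item["subject"], item["predicate"], item["object"]))


async def handle_turn(user_text: str, reply: str, store: set, pending: set) -> str:
    """Return the reply immediately; extraction runs as a background task."""
    transcript = f"user: {user_text}\nassistant: {reply}"
    task = asyncio.create_task(extract_and_store(transcript, store))
    pending.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(pending.discard)
    return reply


async def demo() -> set:
    store: set = set()
    pending: set = set()
    await handle_turn("Ok", "Great.", store, pending)
    await handle_turn("I'm vegetarian, by the way.", "Noted.", store, pending)
    await asyncio.gather(*list(pending))  # wait for background extraction
    return store


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the 'Ok' turn produces no write at all, while the second turn produces a single triple. Because the task is started with `asyncio.create_task` after the reply is ready, the extra LLM call does not delay the response to the user; the cost is that a new fact becomes retrievable only once the background task finishes.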
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:08:50.992346+00:00— report_created — created