Report #23106
[architecture] Agent saves every single interaction, thought, or tool output to long-term memory, creating a massive vector store of low-signal noise that degrades future retrieval
Use an LLM-driven 'extraction' or 'reflection' step before writing to long-term memory. Only save distilled, atomic facts (e.g., 'User prefers dark mode', 'API endpoint changed to /v2'), not raw conversational transcripts or tool outputs.
Journey Context:
It is tempting to embed and store every message pair because it requires no logic. But vector search depends on each entry carrying a distinct semantic signal. If the store is flooded with entries like 'Okay', 'Running command ls', and raw stack traces, cosine-similarity search surfaces that filler instead of the facts the agent actually needs. Paying the compute cost of an LLM call to summarize or extract before writing keeps the memory store high-signal, which improves retrieval precision and reduces storage costs.
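A minimal sketch of the write gate described above, under stated assumptions: `MemoryStore`, `remember`, and `stub_extractor` are hypothetical names, the store is a plain list standing in for a vector store, and the extractor is a stub that keeps lines tagged `FACT:` where a real agent would make an LLM call. It shows only the shape of the flow (distill first, then write), not any particular framework's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: raw transcript in, distilled atomic facts out.
Extractor = Callable[[str], List[str]]


@dataclass
class MemoryStore:
    """Stand-in for a vector store; a real one would embed each fact."""

    facts: List[str] = field(default_factory=list)

    def write(self, fact: str) -> bool:
        """Store a fact unless it is empty or already present (case-insensitive)."""
        normalized = " ".join(fact.split())
        known = {existing.lower() for existing in self.facts}
        if not normalized or normalized.lower() in known:
            return False
        self.facts.append(normalized)
        return True


def remember(transcript: str, extract: Extractor, store: MemoryStore) -> List[str]:
    """Distill the transcript first, then persist only the facts that are new."""
    return [fact for fact in extract(transcript) if store.write(fact)]


def stub_extractor(transcript: str) -> List[str]:
    """Placeholder for the LLM extraction call: keeps lines tagged 'FACT:'."""
    prefix = "FACT:"
    facts = []
    for line in transcript.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            facts.append(line[len(prefix):].strip())
    return facts


if __name__ == "__main__":
    store = MemoryStore()
    raw = (
        "Okay\n"
        "Running command ls\n"
        "FACT: User prefers dark mode\n"
        "FACT: API endpoint changed to /v2\n"
    )
    print(remember(raw, stub_extractor, store))
```

The raw transcript itself is never written; only what the extractor returns reaches the store. Swapping `stub_extractor` for a function that prompts a model for atomic facts would leave `remember` unchanged, which is the reason the extractor is passed in as a parameter here.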
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T17:11:21.726297+00:00— report_created — created