Report #2145
[architecture] Vector search returns results that sound related but are not what the agent needs right now.
Combine embedding similarity with recency, frequency, and task-relevance scores. Retrieve a small candidate set (5–10), re-rank it with a cross-encoder or a lightweight scoring model, and surface only the top 2–4 results to the LLM, each with a source annotation.
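A minimal sketch of the retrieve-then-re-rank flow described above, assuming memories are held in memory with precomputed embeddings. The names `Memory`, `cosine`, and `retrieve_then_rerank` are hypothetical, and `rerank_score` is a placeholder callable standing in for a cross-encoder or a lightweight scoring model.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Memory:
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_then_rerank(
    query_embedding: Sequence[float],
    memories: List[Memory],
    rerank_score: Callable[[Memory], float],
    k: int = 8,       # candidate set size (the report suggests 5-10)
    top_n: int = 3,   # results surfaced to the LLM (the report suggests 2-4)
) -> List[Memory]:
    # Stage 1: cheap vector search narrows the pool to k candidates.
    candidates = sorted(
        memories,
        key=lambda m: cosine(query_embedding, m.embedding),
        reverse=True,
    )[:k]
    # Stage 2: the more expensive scorer reorders only those candidates.
    reranked = sorted(candidates, key=rerank_score, reverse=True)
    return reranked[:top_n]
```

In practice `rerank_score` would close over the query text and call whichever re-ranking model is in use. Only the `k` candidates from stage 1 are ever passed to it, which keeps the expensive step bounded.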
Journey Context:
Pure cosine similarity is blind to intent and freshness: it surfaces documents that use similar wording even when they answer a different question or are stale. Practical memory systems also weight how recently a memory was used, how often it was accessed, and how important it was rated at creation. Re-ranking keeps near-miss retrievals from distracting the LLM. Source annotations tell the model whether a fact came from user input, tool output, or prior inference.
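One way the weighting and annotation described above could look, as a sketch only. The field names, the weights, the one-week half-life, and the access-count cap of 100 are all illustrative assumptions, not values taken from this report.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ScoredMemory:
    text: str
    source: str          # e.g. "user_input", "tool_output", "prior_inference"
    similarity: float    # cosine similarity to the query, assumed in [0, 1]
    last_used_at: float  # Unix timestamp of the most recent use
    access_count: int
    importance: float    # rated at creation, assumed in [0, 1]


def combined_score(
    m: ScoredMemory,
    now: float,
    half_life_s: float = 7 * 24 * 3600,  # assumed: recency halves every week
    weights: Tuple[float, float, float, float] = (0.5, 0.2, 0.1, 0.2),
) -> float:
    """Weighted sum of similarity, recency, frequency, and importance."""
    age = max(0.0, now - m.last_used_at)
    recency = 0.5 ** (age / half_life_s)
    # Log scaling so heavy use saturates; 100 accesses maps to 1.0 (assumed cap).
    frequency = min(1.0, math.log1p(m.access_count) / math.log1p(100))
    w_sim, w_rec, w_freq, w_imp = weights
    return (
        w_sim * m.similarity
        + w_rec * recency
        + w_freq * frequency
        + w_imp * m.importance
    )


def annotate(m: ScoredMemory) -> str:
    """Prefix the memory text with its provenance before it reaches the LLM."""
    return f"[source: {m.source}] {m.text}"
```

With these assumed weights, a memory used moments ago can outrank a slightly more similar one that has not been touched in weeks, which is the behavior the report argues for. The weights would need tuning against real retrieval logs.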
⚠ Workarounds are unverified; always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T10:01:35.934472+00:00— report_created — created