Report #15247
[architecture] Setting top-K too high in memory retrieval degrades agent response quality
Default to a low top-K (3-5) for memory retrieval, and filter with a high similarity threshold (e.g., cosine similarity > 0.75) rather than relying on a fixed K alone. If no memories pass the threshold, return an empty result.
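A minimal sketch of this retrieval policy, assuming memories are stored as (text, embedding) pairs held in a plain Python list. The names `retrieve_memories`, `top_k`, and `min_similarity` are hypothetical and do not come from any particular library; a real system would likely delegate scoring to a vector store.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_memories(
    query_vec: Sequence[float],
    memories: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,                # hypothetical default, taken from the 3-5 range above
    min_similarity: float = 0.75,  # hypothetical default, taken from the example threshold
) -> List[Tuple[float, str]]:
    """Return at most top_k (score, text) pairs scoring above min_similarity.

    Returns an empty list when nothing passes the threshold, so the caller
    can fall back to answering without memory instead of using weak matches.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    # Threshold first: weak matches are dropped no matter how few results remain.
    passing = [pair for pair in scored if pair[0] > min_similarity]
    # Then cap: keep only the strongest top_k of what survived.
    passing.sort(key=lambda pair: pair[0], reverse=True)
    return passing[:top_k]
```

The threshold is applied before the cap, so K acts only as an upper bound: the function may return fewer than K memories, or none at all. The 0.75 value is the example from this report; the right cutoff depends on the embedding model and should be tuned against real queries.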
Journey Context:
Developers often set top-K to 10 or 20, hoping to give the LLM all available context. This backfires because of the 'Lost in the Middle' phenomenon (models use material in the middle of a long context less reliably) and context distraction. An LLM typically performs better with 3 highly relevant memories than with 10 tangentially relevant ones. Returning an empty result when similarity is low is crucial: it prevents the agent from forcing a response out of weak matches. The tradeoff is that the agent may say 'I don't know' more often, but that is preferable to hallucinating from irrelevant retrieved context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:39:53.764781+00:00 — report_created — created