Report #43100
[architecture] Old retrieved memories polluting current context window and overriding new instructions
Apply a relevance threshold and cross-encoder reranking before injecting memories into the prompt, and never present injected memories as more authoritative than the system prompt.
Journey Context:
Injecting the raw top-k results from a vector DB into the context brings in stale, contradictory, or low-signal information. The LLM can then hallucinate or get confused, prioritizing the injected text over its instructions. You need a strict curation step (e.g., cross-encoder reranking, thresholding) between retrieval and injection so that only high-signal, currently relevant memories reach the working context.
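The curation step described above could look roughly like the following sketch. Everything named here is an assumption for illustration, not part of the report: the `Memory` type, the `age_days` staleness field, the `overlap_score` toy scorer (a stand-in for a real cross-encoder), and the default threshold and cutoff values.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Memory:
    text: str
    age_days: float  # hypothetical staleness signal, not from the report


# A score function stands in for a cross-encoder: it receives (query, text)
# pairs and returns one relevance score per pair, in the same order.
ScoreFn = Callable[[Sequence[Tuple[str, str]]], List[float]]


def overlap_score(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in scorer: Jaccard overlap of lowercase tokens."""
    scores = []
    for query, text in pairs:
        q, t = set(query.lower().split()), set(text.lower().split())
        union = q | t
        scores.append(len(q & t) / len(union) if union else 0.0)
    return scores


def curate_memories(
    query: str,
    candidates: Sequence[Memory],
    score_fn: ScoreFn = overlap_score,
    threshold: float = 0.2,      # assumed value; tune per scorer
    max_items: int = 3,          # assumed cap on injected memories
    max_age_days: float = 90.0,  # assumed staleness cutoff
) -> List[Memory]:
    """Drop stale candidates, rerank the rest, keep only those above threshold."""
    fresh = [m for m in candidates if m.age_days <= max_age_days]
    if not fresh:
        return []
    scores = score_fn([(query, m.text) for m in fresh])
    # Sort by score only, so Memory objects are never compared directly.
    ranked = sorted(zip(scores, fresh), key=lambda pair: pair[0], reverse=True)
    return [m for score, m in ranked if score >= threshold][:max_items]


def build_prompt(system_prompt: str, memories: Sequence[Memory], user_message: str) -> str:
    """Place memories after the system prompt, labeled as lower-authority notes."""
    parts = [system_prompt]
    if memories:
        notes = "\n".join(f"- {m.text}" for m in memories)
        parts.append(
            "Background notes (may be outdated; the instructions above take precedence):\n"
            + notes
        )
    parts.append(user_message)
    return "\n\n".join(parts)
```

In a real pipeline, `score_fn` would presumably wrap an actual cross-encoder (for example, the `predict` method of a `CrossEncoder` from sentence-transformers, which scores a list of text pairs). The threshold would then need retuning, since score scales differ between models.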
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:48:56.530716+00:00— report_created — created