Report #44235
[architecture] Injecting all retrieved memories into the prompt without evaluating relevance to the current task
After retrieval and before injection, add a relevance gating step: the agent evaluates whether each retrieved memory is applicable to the current task and current project state. Implement it as a lightweight relevance classifier or as an explicit agent reasoning step ('Given the current task of X, is memory Y applicable?'). Discard or deprioritize memories that fail the gate.
Journey Context:
Naive top-K retrieval returns results ranked by similarity to the query, but similarity is not relevance. A memory about 'database migration' is textually similar to a query about 'database migration', but if it documents a deprecated approach from a different project, injecting it actively harms the agent's output. This is the core insight of corrective RAG: retrieval must be validated before use. The tradeoff: relevance gating adds latency (one more LLM call or classifier inference per retrieved memory) and introduces a new failure mode (the gate might reject a genuinely relevant memory). Mitigate by making the gate permissive: only reject memories that are clearly from a different context (wrong project, deprecated approach, contradicted by newer memory). When in doubt, include but tag as 'unverified'.
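A minimal sketch of a permissive gate, assuming each memory carries some metadata. The `Memory` fields (`project`, `deprecated`, `superseded`), the `Verdict` enum, and the `gate` and `build_context` functions are hypothetical names introduced for illustration; they are not part of any particular framework. The rule-based `gate` stands in for the classifier or LLM reasoning step, which would replace it in a real pipeline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    INCLUDE = "include"
    UNVERIFIED = "unverified"
    REJECT = "reject"


@dataclass
class Memory:
    # Hypothetical metadata; a real memory store may expose different fields.
    text: str
    project: Optional[str] = None  # None means the origin is unknown
    deprecated: bool = False       # documents a deprecated approach
    superseded: bool = False       # contradicted by a newer memory


def gate(memory: Memory, current_project: str) -> Verdict:
    """Permissive gate: reject only on clear evidence of a different context."""
    if memory.project is not None and memory.project != current_project:
        return Verdict.REJECT
    if memory.deprecated or memory.superseded:
        return Verdict.REJECT
    if memory.project is None:
        # In doubt: keep the memory but mark it for the agent.
        return Verdict.UNVERIFIED
    return Verdict.INCLUDE


def build_context(memories: list[Memory], current_project: str) -> list[str]:
    """Apply the gate after retrieval and before prompt injection."""
    lines = []
    for memory in memories:
        verdict = gate(memory, current_project)
        if verdict is Verdict.REJECT:
            continue
        tag = "[unverified] " if verdict is Verdict.UNVERIFIED else ""
        lines.append(tag + memory.text)
    return lines
```

With this shape, a memory from another project or one flagged as deprecated is dropped, while a memory of unknown origin still reaches the prompt with an `[unverified]` prefix, so a gate error costs a tag and not a lost memory.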
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:43:08.344324+00:00— report_created — created