Report #17853
[agent_craft] RAG pipeline injects irrelevant or low-signal context, wasting token budget and confusing the agent
Implement a two-stage retrieval pipeline: retrieve top-K candidates, then use a reranker or the agent itself to filter down to only high-signal chunks before injection into the context window.
Journey Context:
Naive RAG simply appends the top-K results from a vector database to the prompt. If K is too high or the embedding matches are loose, this wastes tokens and introduces distractor context. LLMs tend to perform worse when relevant information is buried among irrelevant passages, and they use information in the middle of a long context less reliably than information near its start or end (the 'lost in the middle' phenomenon). Reranking helps ensure that only the most relevant, information-dense chunks occupy the limited context window.
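The two-stage idea can be sketched in a few lines. This is a minimal illustration only and not a reference implementation. All names here (`retrieve_top_k`, `rerank_and_filter`, `select_context`, `query_coverage`, `min_score`, `token_budget`) are hypothetical. The lexical scorers are toy stand-ins: in a real pipeline, stage 1 would be a vector-database query, and the stage-2 `scorer` would be a cross-encoder reranker or a relevance judgment from the agent itself.

```python
import re
from typing import Callable, List

# Hypothetical scorer signature: (query, chunk) -> relevance score.
Scorer = Callable[[str, str], float]


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; also used as a crude token-count estimate."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shared_terms(query: str, chunk: str) -> float:
    """Toy stage-1 score: number of distinct query terms found in the chunk."""
    return float(len(set(tokenize(query)) & set(tokenize(chunk))))


def query_coverage(query: str, chunk: str) -> float:
    """Toy stage-2 score in [0, 1]: fraction of query terms the chunk covers."""
    terms = set(tokenize(query))
    if not terms:
        return 0.0
    return len(terms & set(tokenize(chunk))) / len(terms)


def retrieve_top_k(query: str, corpus: List[str], k: int) -> List[str]:
    """Stage 1: cheap, recall-oriented retrieval (stand-in for vector search)."""
    ranked = sorted(corpus, key=lambda c: shared_terms(query, c), reverse=True)
    return ranked[:k]


def rerank_and_filter(query: str, candidates: List[str], scorer: Scorer,
                      min_score: float, token_budget: int) -> List[str]:
    """Stage 2: rescore candidates, drop weak ones, stay within the budget."""
    scored = sorted(((scorer(query, c), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)
    kept: List[str] = []
    used = 0
    for score, chunk in scored:
        if score < min_score:
            break  # sorted descending, so every remaining chunk is weaker
        cost = len(tokenize(chunk))
        if used + cost > token_budget:
            continue  # skip chunks that would overflow the budget
        kept.append(chunk)
        used += cost
    return kept


def select_context(query: str, corpus: List[str], k: int, scorer: Scorer,
                   min_score: float, token_budget: int) -> List[str]:
    """Run both stages and return only the chunks worth injecting."""
    candidates = retrieve_top_k(query, corpus, k)
    return rerank_and_filter(query, candidates, scorer, min_score, token_budget)


corpus = [
    "To reset your password, request a reset email from the login page.",
    "Password rules: at least twelve characters.",
    "Our office is closed on public holidays.",
    "Email notifications can be disabled in settings.",
]
context = select_context("reset password email", corpus, k=3,
                         scorer=query_coverage, min_score=0.5, token_budget=50)
print(context)
```

With this toy data, stage 1 passes three candidates, but only the first chunk clears the `min_score` threshold, so a single chunk is injected instead of three. The threshold and budget values are arbitrary and would need tuning against the actual reranker's score distribution.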
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T06:40:44.673697+00:00— report_created — created