Report #78782
[synthesis] Why adding more data to your RAG system makes the AI product worse
Cap the number of retrieved documents passed to the LLM context and apply aggressive pre-filtering (metadata filtering, hybrid search) at the retrieval layer, because over-stuffing the context window triggers the "lost in the middle" attention failure.
Journey Context:
In traditional search or database systems, adding more indexed data generally improves recall and user experience. Product teams often apply the same logic to RAG (Retrieval-Augmented Generation) systems, assuming that connecting more knowledge bases will make the AI smarter. However, LLMs have an attention limitation: they tend to overlook information placed in the middle of a long context ("lost in the middle"). If the retriever pulls in 20 chunks "to be safe", the LLM is more likely to miss the correct answer than if it had received only the top 3. Adding more data to the vector DB also raises the chance that semi-relevant noise is retrieved alongside the right chunk. Optimize for retrieval precision (fewer, highly relevant chunks) over recall, even if that means the AI occasionally says "I don't know".
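A minimal sketch of the precision-first selection described above, assuming an in-memory list of already-scored candidates. The names `Chunk`, `select_context`, and `build_prompt` are hypothetical, and the defaults (`max_chunks=3`, `min_score=0.75`) are illustrative values to tune, not recommendations from the report. In a real deployment the metadata filter would normally be pushed into the vector store query rather than applied afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    score: float  # retriever similarity, higher is better (assumed scale 0..1)
    metadata: dict = field(default_factory=dict)


def select_context(candidates, required_metadata, max_chunks=3, min_score=0.75):
    """Pre-filter by metadata, drop weak matches, then cap the result."""
    # 1. Metadata pre-filter: keep only chunks whose metadata matches every
    #    required key/value pair (e.g. the product area of the question).
    filtered = [
        c for c in candidates
        if all(c.metadata.get(k) == v for k, v in required_metadata.items())
    ]
    # 2. Score threshold: discard semi-relevant noise instead of padding.
    strong = [c for c in filtered if c.score >= min_score]
    # 3. Hard cap: pass only the best few chunks to the LLM.
    strong.sort(key=lambda c: c.score, reverse=True)
    return strong[:max_chunks]


def build_prompt(question, chunks):
    """Return a prompt, or None to signal that the app should abstain."""
    if not chunks:
        return None  # caller answers "I don't know" rather than guessing
    context = "\n\n".join(c.text for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


candidates = [
    Chunk("a", 0.90, {"product": "billing"}),
    Chunk("b", 0.95, {"product": "auth"}),     # wrong product: filtered out
    Chunk("c", 0.60, {"product": "billing"}),  # below threshold: dropped
    Chunk("d", 0.80, {"product": "billing"}),
    Chunk("e", 0.85, {"product": "billing"}),
    Chunk("f", 0.78, {"product": "billing"}),  # passes, but cut by the cap
]
picked = select_context(candidates, {"product": "billing"})
prompt = build_prompt("How do refunds work?", picked)
```

Returning `None` from `build_prompt` makes the abstain path explicit: when nothing clears the filter and the threshold, the application can reply "I don't know" instead of sending weak context to the model.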
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:49:58.833081+00:00— report_created — created