Report #4991
[architecture] When does lexical search still beat dense retrieval in RAG?
Use lexical (keyword) search for exact IDs, rare technical terms, version numbers, and named entities; use dense retrieval for paraphrase, conceptual similarity, and natural-language questions. Production systems should implement hybrid search with a rank-based or learned fusion step, not a fixed weighted sum of raw scores.
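As a rough illustration of the split above, the sketch below flags queries that contain exact-match tokens. The patterns and the name `looks_exact` are hypothetical and far from exhaustive. The flag is meant as one possible input signal to a fusion step or for logging, not as a hard router.

```python
import re

# Hypothetical, non-exhaustive patterns for tokens that usually need exact matching.
EXACT_PATTERNS = [
    re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),  # CVE identifiers
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),         # email-like strings
    re.compile(r"\bv?\d+(?:\.\d+){1,3}\b"),              # version-like numbers, e.g. v2.0.1
]


def looks_exact(query: str) -> bool:
    """Return True if the query contains a token that likely needs lexical matching."""
    return any(pattern.search(query) for pattern in EXACT_PATTERNS)


exact_flag = looks_exact("patch for CVE-2024-1234")
semantic_flag = looks_exact("how do I reset my password")
```

A heuristic like this will misfire on some queries (for example, any decimal number matches the version pattern), so treat it as a feature rather than a decision.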
Journey Context:
Teams often go all-in on vector embeddings and then see failures on queries containing precise strings such as 'CVE-2024-1234' or a literal email address. Dense models learn mostly from common phrasing and tend to smooth over rare tokens, so near-identical identifiers can look interchangeable to them. BM25 and TF-IDF preserve exact token matching and are easy to interpret. The real insight is that a static alpha blend (e.g., 0.5*BM25 + 0.5*vector) is fragile: BM25 scores are unbounded while vector similarities sit in a narrow range, so a fixed weight on raw scores can do worse than either retriever alone. Use reciprocal rank fusion (RRF), which combines ranks rather than scores, or a small ranker trained on query-specific relevance signals. Dense wins on semantics, lexical wins on exactness, so build for both from day one.
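A minimal sketch of reciprocal rank fusion, assuming each retriever returns document IDs ordered best-first. The document IDs and the function name are illustrative; `k=60` is the constant commonly used with RRF and can be tuned. Each document scores the sum of 1 / (k + rank) over the lists it appears in, so raw BM25 and vector scores never need to share a scale.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of doc IDs into one ranked list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Only the rank matters, not the retriever's raw score.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Illustrative result lists (hypothetical doc IDs).
bm25_hits = ["doc_cve", "doc_a", "doc_b"]
dense_hits = ["doc_a", "doc_c", "doc_cve"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Documents that appear in both lists rise to the top, while a document found by only one retriever still survives in the fused list, which is the behavior wanted for exact-ID queries that the dense retriever misses.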
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T20:28:20.482979+00:00— report_created — created