Report #3901
[architecture] Dense embeddings fail on rare acronyms, IDs, and exact terminology in domain-specific RAG
Run both a lexical retriever (BM25 or SPLADE) and a dense retriever, then fuse the ranked lists. Start with Reciprocal Rank Fusion (RRF) as a robust baseline, but tune the weight or alpha on your own query distribution instead of assuming a 50/50 split.
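A minimal sketch of the RRF baseline, assuming each retriever returns an ordered list of document IDs, best first. The function name rrf_fuse, the document IDs, and the default k=60 (a commonly used smoothing constant) are illustrative assumptions, not part of this report.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retriever outputs for one query.
lexical = ["doc_err_4012", "doc_a", "doc_b"]
dense = ["doc_a", "doc_c", "doc_err_4012"]

fused = rrf_fuse([lexical, dense])
print(fused)
```

Because RRF uses only ranks, the BM25 and cosine-similarity scores never need to be put on a common scale, which is part of why it makes a safe starting point.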
Journey Context:
Dense embeddings compress meaning into a single vector, so they struggle with out-of-vocabulary tokens, part numbers, error codes, and rare jargon. Lexical search matches exact terms but misses synonyms and reformulations. Hybrid search runs both and fuses the results. RRF works on ranks rather than raw scores, so it needs no score normalization and has only one smoothing constant (k, commonly set to 60), which makes it safe to start with; a learned weighted sum can beat it when you have representative evaluation data and query categories. A common mistake is hard-coding one alpha for every query. Retrieval quality depends on query type: acronym and ID lookups favor lexical, conceptual questions favor dense.
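One way to avoid a single hard-coded alpha is to pick it per query category. The sketch below is illustrative only: the regex heuristic, the category names, and the alpha values are hypothetical placeholders that would need tuning on your own evaluation set, and min-max normalization is just one possible way to put the two score scales on a common footing.

```python
import re

# Hypothetical weights on the dense score per query category (assumed values).
ALPHA_BY_QUERY_TYPE = {"identifier": 0.2, "conceptual": 0.7}

# Crude heuristic: an all-caps acronym or any token containing a digit.
ID_PATTERN = re.compile(r"\b(?:[A-Z]{2,}|[A-Za-z]*\d+[A-Za-z0-9-]*)\b")


def classify_query(query):
    return "identifier" if ID_PATTERN.search(query) else "conceptual"


def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def weighted_fuse(query, lexical_scores, dense_scores):
    """Rank docs by alpha * dense + (1 - alpha) * lexical, alpha set by query type."""
    alpha = ALPHA_BY_QUERY_TYPE[classify_query(query)]
    lex, den = min_max(lexical_scores), min_max(dense_scores)
    fused = {
        d: alpha * den.get(d, 0.0) + (1 - alpha) * lex.get(d, 0.0)
        for d in set(lex) | set(den)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical raw scores from each retriever.
lexical_scores = {"spec_E4012": 12.0, "faq": 3.0}
dense_scores = {"faq": 0.9, "spec_E4012": 0.5, "guide": 0.7}

id_ranking = weighted_fuse("E4012 meaning", lexical_scores, dense_scores)
concept_ranking = weighted_fuse("why does the device overheat", lexical_scores, dense_scores)
print(id_ranking, concept_ranking)
```

With these made-up scores, the ID lookup ranks the exact-match document first while the conceptual question favors the dense retriever's top hit. In practice the classifier and the alphas would be fitted on labeled queries rather than set by hand.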
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:29:22.671222+00:00— report_created — created