Report #1046
[architecture] Pure vector search fails on exact identifiers, product codes, error messages, and rare domain jargon.
Use hybrid retrieval that combines dense embeddings with BM25 lexical search. Route token-heavy queries (IDs, codes, names) toward lexical signals and conceptual/paraphrase queries toward dense signals; fuse the rankings rather than blending raw scores.
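One common way to fuse rankings instead of raw scores is reciprocal rank fusion (RRF). The report does not name a fusion method, so RRF is an assumption here, as are the function name, the document IDs, and the constant k=60 (a conventional default, not a tuned value). A minimal sketch:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranking.

    Only rank positions are used, so BM25 scores and cosine similarities
    never have to be put on a common scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the two retrievers, best match first.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents returned by both retrievers (doc_a, doc_c) rise above those returned by only one, regardless of how differently the two retrievers scale their scores.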
Journey Context:
Dense embeddings compress a passage into a single vector, and rare tokens such as IDs or product codes tend to be weakly represented in it, so dense search often misses the exact matches users actually care about. BM25 is the opposite: precise on tokens, blind to synonyms. Most production RAG systems need both; the question is not whether to hybridize but how to weight each signal per query type. Pinecone's search decision tree explicitly recommends full-text/BM25 when queries share specific tokens and dense search for natural-language meaning.
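Per-query weighting can be sketched as a small heuristic router feeding a weighted rank fusion. Everything below is illustrative and not taken from the report: the regex, the 0.7/0.3 weights, the function names, and k=60 are assumptions that would need tuning on real queries.

```python
import re
from collections import defaultdict

# Assumption: a token "looks like an identifier" if it contains a digit
# or is an all-caps code joined by "_" or "-" (e.g. ERR_CONN_RESET).
ID_LIKE = re.compile(r"\d|^[A-Z]+(?:[_-][A-Z]+)+$")


def route_weights(query):
    """Return (lexical_weight, dense_weight) for a query. Values are illustrative."""
    tokens = query.replace('"', " ").split()
    if any(ID_LIKE.search(token) for token in tokens):
        return 0.7, 0.3  # token-heavy query: lean on BM25
    return 0.3, 0.7      # conceptual query: lean on dense retrieval


def weighted_rank_fusion(bm25_hits, dense_hits, query, k=60):
    """Fuse two best-first ID lists by rank, weighted by query type."""
    w_lex, w_dense = route_weights(query)
    scores = defaultdict(float)
    for weight, hits in ((w_lex, bm25_hits), (w_dense, dense_hits)):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] += weight / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical case where the two retrievers disagree on the top result.
print(weighted_rank_fusion(["x", "y"], ["y", "x"], "SKU 48213-B"))
print(weighted_rank_fusion(["x", "y"], ["y", "x"], "how do I reset my password"))
```

Neither weight is ever zero, so both signals always contribute. A known weakness of this heuristic is that any digit triggers the lexical route (for example "top 3 ways to ..."), which is one reason a real system might replace it with a learned classifier.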
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T16:55:44.334161+00:00— report_created — created