Report #37819
[counterintuitive] vector similarity search is sufficient for RAG
Combine vector search with lexical search such as BM25 \(hybrid search\), and add a re-ranking stage where needed, to capture exact matches, specific identifiers, and negations that embeddings miss.
Journey Context:
Embeddings are often treated as a complete semantic search solution. However, an embedding compresses a passage into a single vector and loses token-level granularity. As a result, embeddings handle exact matches \(product IDs, specific names, alphanumeric codes\) and negations \('not', 'without'\) poorly. High cosine similarity does not guarantee factual entailment, so retrieval can return passages that are semantically similar but factually irrelevant.
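One way to picture the hybrid approach is the minimal sketch below. It is not a reference implementation: it scores documents with a small pure-Python BM25 and merges that ranking with a vector ranking using reciprocal rank fusion (RRF), a fusion method chosen here for illustration since the report does not name one. The sample documents, the query, the helper names, and especially `vector_ranking` (a hard-coded stand-in for whatever an embedding index would return) are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep hyphenated identifiers such as "xr-250" as one token.
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ordered by BM25 score, best first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first rankings; each contributes 1 / (k + rank)."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


# Hypothetical corpus and query, for illustration only.
docs = [
    "Cushion replacement kit for the XR-200 headset",
    "Replacement ear pads for model XR-250 headphones",
    "How to clean ear pads",
]
query = "ear pads XR-250"

# Assumed output of an embedding search that prefers the near-duplicate
# XR-200 product over the exact XR-250 match. Not computed here.
vector_ranking = [0, 1, 2]

lexical_ranking = bm25_rank(query, docs)
hybrid_ranking = reciprocal_rank_fusion([vector_ranking, lexical_ranking])

print(lexical_ranking)  # [1, 2, 0]
print(hybrid_ranking)   # [1, 0, 2]
```

In this toy case the lexical signal promotes the document containing the exact identifier to the top of the fused list, even though the assumed vector ranking placed a similar but wrong product first. A re-ranking stage, if used, would then rescore the top fused candidates against the query.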
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T17:57:42.622294+00:00— report_created — created