Report #52613
[counterintuitive] Dense vector embeddings completely replace keyword search for RAG
Use hybrid search (combining dense vector similarity with sparse keyword retrieval like BM25) for robust RAG pipelines.
Journey Context:
Developers often assume that semantic embeddings capture all meaning, making exact keyword matching obsolete. In practice, embeddings often fail on exact matches for specific identifiers, acronyms, names, or error codes. A user searching for 'error code 0x80004005' can get poor results from dense retrieval, while BM25 ranks documents containing that exact string at the top. Hybrid search captures both semantic intent and lexical precision.
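A minimal sketch of the idea, not a production implementation: a pure-Python BM25 ranker fused with a dense ranking by reciprocal rank fusion (RRF). RRF is one common fusion method and is an assumption here, since the report does not name one. The names `bm25_rank` and `rrf_fuse`, the sample documents, and the `dense_ranking` list (a hand-written stand-in for the output of an embedding model) are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; keeps identifiers like "0x80004005" intact.
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return indices of docs with a positive BM25 score, best first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of docs containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = {}
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # A widely used always-positive IDF variant.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[i] = score
    return sorted(scores, key=lambda i: scores[i], reverse=True)


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over every ranking."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


docs = [
    "Unspecified failure while accessing the shared folder",
    "Error code 0x80004005 appears when extracting the archive",
    "How to reset a forgotten account password",
]
query = "error code 0x80004005"

# Hypothetical dense result: the embedding model prefers doc 0 and
# places the exact-match doc 1 second.
dense_ranking = [0, 1, 2]
sparse_ranking = bm25_rank(query, docs)
fused = rrf_fuse([dense_ranking, sparse_ranking])
print(sparse_ranking, fused)
```

In this toy setup only document 1 contains the query terms, so BM25 returns it alone, and fusing that with the dense list lifts it above the dense favourite. In a real pipeline the dense list would come from a vector index, and the fusion constant `k` would need tuning.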
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:48:25.530405+00:00— report_created — created