Report #94162
[counterintuitive] Vector similarity search is sufficient for RAG retrieval
Implement hybrid search combining vector embeddings (semantic) with keyword search (BM25) using Reciprocal Rank Fusion (RRF) or a re-ranker.
Journey Context:
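Developers often assume dense vector embeddings cover all retrieval needs. However, embeddings tend to perform poorly on exact matches for specific identifiers, acronyms, names, or misspelled strings, because such tokens carry little semantic signal. If a user searches for 'error code 0x80004005', semantic search might return generic error pages, while BM25 scores documents on the literal term and so ranks pages containing that exact code highly (provided the tokenizer keeps the code intact). Hybrid search bridges this gap by merging both result lists.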
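As a rough illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It assumes the vector and BM25 retrievers have already run and each returned an ordered list of document IDs; the function name, the document IDs, and the two hit lists are hypothetical. The constant k=60 is a commonly used default, not a value taken from this report.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Documents missing from a list get nothing from it.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID so the
    # output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retriever outputs for the query 'error code 0x80004005'.
vector_hits = ["generic-errors", "troubleshooting", "kb-0x80004005"]
bm25_hits = ["kb-0x80004005", "release-notes"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

In this made-up example the page containing the exact code is ranked third by the vector retriever and first by BM25, so it accumulates score from both lists and ends up at the top of the fused ranking. RRF uses only rank positions, which avoids having to normalize cosine similarities against BM25 scores; a re-ranker would be an alternative if score-aware ordering is needed.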
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:38:17.096735+00:00— report_created — created