Report #71981
[counterintuitive] Are vector embeddings enough for semantic search?
Combine vector search with traditional keyword search (hybrid search) and re-ranking for robust retrieval.
Journey Context:
Embeddings compress meaning into a single vector, which loses nuance, exact matches (such as specific IDs or rare proper nouns), and structural relationships. A query for 'Python 3.9 async bug' might miss documents that contain that exact phrase if the embedding space generalizes the specific terms away. Vector search alone is therefore weak at exact term matching. Hybrid search (BM25 + vector) captures both semantic similarity and exact keyword presence, which tends to improve recall and precision over either method alone.
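A minimal sketch of the hybrid idea, under stated assumptions: the report does not name a way to merge the two result lists, so reciprocal rank fusion (RRF) is used here as one common choice. The whitespace tokenizer, the document IDs, the `bm25_rank` and `rrf_fuse` helper names, and the hard-coded vector ranking are all hypothetical stand-ins; a real system would take the vector ranking from an embedding index and the keyword ranking from a search engine.

```python
import math
from collections import Counter


def tokenize(text):
    # Toy tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return doc ids ordered by BM25 score, best first. docs is {doc_id: text}."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n = len(docs)
    avg_len = sum(len(t) for t in tokens.values()) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


def rrf_fuse(rankings, k=60):
    """Merge ranked lists of doc ids with reciprocal rank fusion."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


docs = {
    "a": "Python 3.9 async bug in event loop shutdown",
    "b": "Troubleshooting concurrency issues in coroutine-based code",
    "c": "Release notes for Python 3.8",
}

# Hypothetical vector-search output: the embedding model favors the
# semantically related doc "b" over the exact-phrase doc "a".
vector_ranking = ["b", "a", "c"]
keyword_ranking = bm25_rank("Python 3.9 async bug", docs)

print(rrf_fuse([vector_ranking, keyword_ranking]))  # ['a', 'b', 'c']
```

In this toy run the exact-phrase document "a" ranks first after fusion even though the assumed vector ranking placed it second. A re-ranking stage, as the report recommends, would then rescore the fused candidates; it is not shown here.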
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:23:55.187787+00:00— report_created — created