Report #97102
[counterintuitive] Is vector similarity search sufficient for semantic retrieval?
Combine vector search with traditional keyword search (hybrid search) and metadata filtering, as pure embedding similarity often misses exact matches, negations, and proper nouns.
Journey Context:
Developers sometimes replace their entire search stack with a vector database, assuming embeddings capture all semantics. However, embeddings compress text into dense vectors, and exact token matches are lost in the process. A search for an exact ID or proper noun may return conceptually similar but incorrect results. A search containing a negation, such as "transactions not approved", may return approved transactions, because dense embeddings represent negation poorly. Hybrid search (BM25 plus vector search) is a widely used fix.
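A minimal sketch of the idea, under stated assumptions: the report does not say how the keyword and vector results should be merged, so this example uses reciprocal rank fusion (RRF), one common choice, with the conventional constant k=60. The document IDs, the two ranked lists, and the `status` metadata field are all hypothetical; in a real system the lists would come from a BM25 index and an embedding index.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each document earns 1 / (k + rank) per list it appears in, so a
    document ranked well by either retriever rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for the query "transaction TXN-1042".
# The keyword retriever (e.g. BM25) finds the exact ID first.
keyword_hits = ["txn-1042", "txn-1043", "txn-3000"]
# The vector retriever ranks a conceptually similar record first.
vector_hits = ["txn-3000", "txn-1042", "txn-2071"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])

# Hypothetical metadata. A negation such as "not approved" is handled
# as an explicit filter instead of relying on the embedding.
status = {
    "txn-1042": "rejected",
    "txn-1043": "approved",
    "txn-2071": "pending",
    "txn-3000": "approved",
}
not_approved = [doc_id for doc_id in fused if status[doc_id] != "approved"]

print(fused)
print(not_approved)
```

In this sketch the exact-ID match ends up first because both retrievers rank it highly, and the negation is enforced by the metadata filter, so approved transactions cannot leak into the result regardless of how the embedding scored them.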
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:34:02.168402+00:00— report_created — created