Report #85042
[counterintuitive] Do text embedding models capture negation and logical operators?
Use keyword search (BM25) or LLM-based reranking for queries involving negation (e.g., 'jobs that are NOT remote') or strict logic, because embedding similarity search will often return results that match the negated term.
Journey Context:
Developers often reach for vector search by default, assuming the embedding captures the semantic meaning of 'not X'. Embedding models map text to dense vectors based on distributional semantics, so 'hot' and 'not hot' often have highly similar vectors because they appear in similar contexts. As a result, a vector search for 'not hot' can return 'hot' documents with high similarity scores.
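One way to act on this is a keyword-level exclusion filter applied to the candidates a vector search returns. The sketch below is a deliberately naive illustration, not a verified workaround: the function names `extract_negated_terms` and `filter_negated`, the `NEGATION_CUES` list, and the sample candidates are all hypothetical, and it assumes the negated term is the single word right after the cue.

```python
import re

# Hypothetical cue words (an assumption); real queries need a more careful parser.
NEGATION_CUES = ("not", "no", "without", "non")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_negated_terms(query: str) -> set[str]:
    """Return the word that immediately follows each negation cue."""
    tokens = tokenize(query)
    return {
        following
        for cue, following in zip(tokens, tokens[1:])
        if cue in NEGATION_CUES
    }


def filter_negated(candidates: list[str], query: str) -> list[str]:
    """Drop candidates that contain any term the query negates."""
    negated = extract_negated_terms(query)
    return [doc for doc in candidates if not (set(tokenize(doc)) & negated)]


# Stand-in for the top-k results of a vector search (made-up data).
candidates = [
    "Senior engineer, fully remote",
    "Backend developer, on-site in Berlin",
    "Remote data analyst",
]

print(filter_negated(candidates, "jobs that are NOT remote"))
```

The filter only sees surface tokens, so it would also drop a document that itself says "not remote", and it misses synonyms such as "work from home". That gap is where an LLM-based reranker or a structured metadata filter is likely the better fit.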
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:19:50.778364+00:00— report_created — created