Report #61499
[counterintuitive] Do embedding models capture negation and logical operators?
Do not rely on vector similarity search for queries involving negation (e.g., 'jobs that are NOT remote') or complex boolean logic; use hybrid search or metadata filtering instead.
Journey Context:
Embeddings map text to a continuous vector space based on semantic similarity. 'Not remote' and 'remote' are highly semantically similar (both are about the same topic: remote work), so their embeddings tend to lie close together. The embedding space has no reliable geometric operation for logical NOT, so a negated query can return exactly the items it was meant to exclude.
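A minimal sketch of the metadata-filtering approach, assuming each record carries a structured `remote` flag alongside its embedding. The job records, the 2-d vectors, and the `search` helper are all hypothetical and hand-made for illustration; they are not output from any real embedding model or vector database. The idea is to embed only the topical part of the query and express the NOT as a hard filter applied before similarity ranking.

```python
import math

# Hypothetical job records. The 2-d "embeddings" are invented for illustration only.
JOBS = [
    {"title": "Backend engineer (remote)", "remote": True, "vec": [0.9, 0.1]},
    {"title": "Backend engineer (on-site, Berlin)", "remote": False, "vec": [0.8, 0.3]},
    {"title": "Office manager (on-site)", "remote": False, "vec": [0.1, 0.9]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search(query_vec, jobs, remote=None, top_k=5):
    """Apply the boolean constraint as a hard filter, then rank survivors by similarity."""
    candidates = [j for j in jobs if remote is None or j["remote"] == remote]
    ranked = sorted(candidates, key=lambda j: cosine(query_vec, j["vec"]), reverse=True)
    return ranked[:top_k]


# Embed only the topic ("backend engineer"); the negation lives in the filter.
query_vec = [1.0, 0.0]  # hypothetical embedding of the topical query
results = search(query_vec, JOBS, remote=False)
titles = [j["title"] for j in results]
```

With these toy vectors, an unfiltered `search(query_vec, JOBS)` ranks the remote posting first, which is the failure mode described above; passing `remote=False` removes it from the candidate set before any similarity score is computed. Most vector stores expose an equivalent pre-filter or hybrid query option, but the exact syntax varies by product and should be checked against its documentation.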
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:43:01.104100+00:00— report_created — created