Report #36576
[counterintuitive] Do embedding models capture negation and logical operators?
Pre-process queries to handle negations logically before embedding, and use keyword/BM25 search alongside embeddings (hybrid search) for queries involving exact matches, exclusions, or specific IDs.
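A minimal sketch of the pre-processing step, assuming a simple English cue list: it pulls negated terms out of the query so that only the positive part is embedded, and applies the exclusions as a hard filter on the results. The names `NEGATION_PATTERN`, `split_query`, and `apply_exclusions` are hypothetical, and the regex catches only the single word following a cue; real queries would likely need a more careful parser.

```python
import re

# Hypothetical cue list; only the single word after each cue is treated as excluded.
NEGATION_PATTERN = re.compile(
    r"\b(?:not|without|except|excluding|no)\s+(\w+)", re.IGNORECASE
)

def split_query(query: str) -> tuple[str, list[str]]:
    """Return (positive_text, excluded_terms) for a raw user query."""
    excluded = [m.group(1).lower() for m in NEGATION_PATTERN.finditer(query)]
    # Remove the negated phrases so they do not pull the embedding toward them.
    positive = " ".join(NEGATION_PATTERN.sub(" ", query).split())
    return positive, excluded

def apply_exclusions(docs: list[str], excluded: list[str]) -> list[str]:
    """Drop documents that contain any excluded term as a whole word."""
    def contains(doc: str, term: str) -> bool:
        return re.search(rf"\b{re.escape(term)}\b", doc, re.IGNORECASE) is not None
    return [d for d in docs if not any(contains(d, t) for t in excluded)]

positive, excluded = split_query("laptops without touchscreen")
candidates = ["Laptop with touchscreen", "Laptop with matte display"]
filtered = apply_exclusions(candidates, excluded)
```

Here only `positive` would be passed to the embedding model, while `excluded` is enforced after retrieval rather than left to vector similarity.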
Journey Context:
Developers often assume that semantic search over embeddings understands 'not X' or 'Y AND Z'. Embeddings encode semantic similarity, so 'not happy' sits very close to 'happy' in vector space. They handle logical operators, negation, and exact ID matching poorly because they compress meaning into continuous geometric proximity and lose the discrete logical structure of the query.
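One way to sketch the hybrid approach is to rank documents with a keyword scorer and merge that ranking with the vector ranking. The example below uses a small Okapi BM25 implementation and reciprocal rank fusion (RRF) as the merge step; RRF is one common choice, not something this report prescribes. The function names, the whitespace tokenizer, the sample documents, and the `vector_ranking` list (a stand-in for whatever an embedding index returns) are all assumptions for illustration.

```python
import math
from collections import Counter

def tokenize(text: str) -> list[str]:
    # Naive whitespace tokenizer; keeps IDs such as "inv-1042" intact.
    return text.lower().split()

def bm25_rank(query: str, docs: dict[str, str], k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank doc ids by Okapi BM25 score; documents scoring 0 are omitted."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores: dict[str, float] = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: sum 1 / (k + rank) across all rankings."""
    fused: Counter = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

docs = {
    "a": "invoice INV-1042 overdue payment",
    "b": "invoice INV-2042 paid in full",
    "c": "payment reminder for customer",
}
# Hypothetical output of an embedding index that confuses the two similar IDs.
vector_ranking = ["b", "a", "c"]
keyword_ranking = bm25_rank("INV-1042", docs)
hybrid_ranking = rrf_fuse([vector_ranking, keyword_ranking])
```

In this toy case the keyword ranking contains only the document with the exact ID, so fusion lifts it above the near-duplicate that the assumed vector ranking preferred.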
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:52:21.183100+00:00— report_created — created