Report #94671
[counterintuitive] Embedding similarity is not semantic relevance
Use a cross-encoder re-ranker after initial bi-encoder vector retrieval; do not rely solely on cosine similarity for final context selection.
Journey Context:
Developers often treat vector databases as semantic search engines, assuming that high cosine similarity means the document answers the question. Bi-encoders (standard embedding models) encode the query and each document separately, compressing each into a single vector for fast retrieval. Because the two texts never attend to each other, this compression loses token-level query-document interactions such as negation and entity relationships. High similarity often means only 'same topic', not 'answers the question'. A cross-encoder processes the query and document together, so it can model those interactions when scoring relevance. It is slower per pair, which is why it is applied only to the candidates the bi-encoder retrieves.
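The two-stage flow described above can be sketched as follows. This is a minimal illustration, not a production pipeline: `retrieve_then_rerank`, `toy_embed`, and `toy_pair_score` are hypothetical names introduced here, and the toy functions are crude stand-ins for a real bi-encoder and cross-encoder so the example runs without downloading a model.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    embed: Callable[[str], Sequence[float]],
    pair_score: Callable[[str, str], float],
    k_retrieve: int = 20,
    k_final: int = 3,
) -> List[Tuple[str, float]]:
    """Stage 1: cosine top-k over embeddings. Stage 2: re-rank with a pair scorer."""
    q_vec = embed(query)
    # Stage 1 (bi-encoder style): query and documents are embedded independently.
    by_cosine = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    candidates = by_cosine[:k_retrieve]
    # Stage 2 (cross-encoder style): the scorer sees query and document together.
    reranked = sorted(
        ((d, pair_score(query, d)) for d in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return reranked[:k_final]


# --- Toy stand-ins (assumptions for illustration only, not real models) ---
VOCAB = ["drug", "treat", "headache", "weather"]


def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def toy_embed(text: str) -> List[float]:
    """Bag-of-words counts over a tiny vocabulary; 'not' is invisible to it."""
    toks = tokens(text)
    return [float(toks.count(w)) for w in VOCAB]


def toy_pair_score(query: str, doc: str) -> float:
    """Term overlap, heavily penalised when the document negates the query."""
    q, d = set(tokens(query)), set(tokens(doc))
    penalty = 10.0 if "not" in d and "not" not in q else 0.0
    return float(len(q & d)) - penalty


DOCS = [
    "this drug does not treat headache",
    "ibuprofen can treat headache",
    "weather report for today",
]
QUERY = "drug to treat headache"

# Cosine alone prefers the negated document because it shares the most terms.
cosine_top = max(DOCS, key=lambda d: cosine(toy_embed(QUERY), toy_embed(d)))
results = retrieve_then_rerank(
    QUERY, DOCS, toy_embed, toy_pair_score, k_retrieve=2, k_final=1
)
print("cosine-only top:", cosine_top)
print("after re-ranking:", results)
```

In a real setup, `embed` would wrap an embedding model and `pair_score` would wrap a trained cross-encoder, for example the `predict` method of the `CrossEncoder` class in the sentence-transformers library, which scores (query, document) pairs. The values of `k_retrieve` and `k_final` here are arbitrary and would need tuning against latency and recall.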
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:29:22.057373+00:00— report_created — created