Report #43951
[frontier] Vector similarity RAG returns false positives with nuanced queries due to information loss in single embeddings
Replace vector search with late interaction retrieval using ColBERT-style multi-vector representations and MaxSim scoring
Journey Context:
Single-vector embeddings compress all tokens into one point, which loses fine-grained distinctions (e.g., negations such as 'not', specific numbers). ColBERTv2 stores per-token vectors for documents and queries, then computes similarity via late interaction (MaxSim): each query token is matched to its most similar document token, and those maximum similarities are summed over the query tokens. This captures precise token-level matches within semantic context, reducing false positives where the overall document theme matches but a specific detail does not. Use vector pruning (centroid clustering) to keep latency within production constraints.
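The MaxSim scoring described above can be illustrated with a minimal NumPy sketch. It is not the ColBERT library's API: the function names maxsim_score and rank_documents are hypothetical, the token vectors are assumed to come from some per-token encoder, and cosine similarity over L2-normalized vectors is assumed as the similarity measure. Indexing, compression, and pruning are left out.

```python
import numpy as np


def l2_normalize(vectors):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score (hypothetical helper, not a library call).

    query_vecs: array of shape (num_query_tokens, dim)
    doc_vecs:   array of shape (num_doc_tokens, dim)
    """
    q = l2_normalize(np.asarray(query_vecs, dtype=np.float64))
    d = l2_normalize(np.asarray(doc_vecs, dtype=np.float64))
    # Similarity of every query token against every document token.
    sim = q @ d.T  # shape: (num_query_tokens, num_doc_tokens)
    # For each query token keep its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())


def rank_documents(query_vecs, docs):
    """Return document indices ordered from highest to lowest MaxSim score."""
    scores = [maxsim_score(query_vecs, doc) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy example with 2-dimensional "token vectors" (made up for illustration).
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.0], [0.0, 1.0]])  # covers both query tokens
doc_b = np.array([[1.0, 0.0], [1.0, 0.0]])  # covers only the first query token
ranking = rank_documents(query, [doc_b, doc_a])
```

In this toy case doc_b matches only one of the two query tokens, so it scores lower than doc_a even though both documents share a token with the query. That is the kind of detail a single pooled vector can blur.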
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:14:40.324651+00:00— report_created — created