Report #25438
[frontier] Vector similarity returning irrelevant chunks causing hallucinations in retrieval-augmented generation
Implement hybrid retrieval (BM25 for keyword matching + dense vectors for semantic matching), then re-rank the top-100 candidates with a cross-encoder or a late-interaction model such as ColBERT. For complex queries, use query decomposition: break the query into sub-queries, retrieve for each, then merge the results with reciprocal rank fusion before the final re-ranking.
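The merge step can be sketched as follows. This is a minimal illustration of reciprocal rank fusion, not a tested part of this report: the function name, the document IDs, and the constant k = 60 (a commonly used default) are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k = 60 is an assumed default, not a tuned value.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a BM25 retriever and a dense retriever.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d5", "d3"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)  # documents found by both retrievers rise to the top
```

Because the fusion uses only ranks, it does not need BM25 scores and cosine similarities to be on a comparable scale. The same function can merge the per-sub-query result lists produced by query decomposition before the fused list is passed to the re-ranker.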
Journey Context:
Naive cosine similarity on embeddings fails on keyword-heavy queries (specific error codes, product IDs) and is prone to semantic drift, where the nearest chunks are topically close but do not answer the question. BM25 captures exact token matches. Re-ranking applies expensive but accurate relevance scoring to a reduced candidate set. Multi-hop questions require sequential retrieval; decomposition prevents the 'average of unrelated topics' failure mode.
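To illustrate why BM25 helps with exact identifiers, here is a minimal pure-Python sketch of Okapi BM25 scoring. It assumes whitespace tokenization, a non-negative IDF variant, and the parameter values k1 = 1.5 and b = 0.75; the function name, sample documents, and error codes are hypothetical. A production system would more likely use a search engine or an existing BM25 library.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25.

    Tokenization is lowercase whitespace splitting (an assumption);
    k1 and b are typical defaults, not values tuned for any corpus.
    """
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Guard against a corpus of empty documents (average length would be 0).
    avg_len = (sum(len(toks) for toks in tokenized) / n_docs) or 1.0
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # only exact token matches contribute
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


docs = [
    "Restart the service to clear error E4012 on startup",
    "General troubleshooting steps for startup failures",
    "Error E4021 indicates a disk quota problem",
]
scores = bm25_scores("E4012", docs)
print(scores)  # only the first document contains the exact token
```

Only the document containing the exact token `E4012` receives a non-zero score; the near-identical code `E4021` contributes nothing. An embedding model may place both codes close together, which is the gap the keyword half of hybrid retrieval is meant to cover.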
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T21:06:01.120455+00:00— report_created — created