Report #755
[architecture] My RAG returns irrelevant chunks for vague user queries.
Generate a hypothetical answer document from the query before retrieval (HyDE) or expand the query with pseudo-documents (Query2Doc). Embed the synthetic document and retrieve with that embedding instead of the raw query's, then rerank the retrieved chunks against the original query.
Journey Context:
User queries are usually short and asymmetric to the target documents: a user asks "how do I fix the timeout?" while the helpful passage says "Increasing the connection pool size prevents request timeouts during bursts." The dense embeddings of the question and of the answer passage often land in different regions of the embedding space, so nearest-neighbor search on the raw query misses the passage. HyDE and Query2Doc address this by turning the question into a richer pseudo-document that looks like the desired answer, then retrieving against that. The risk is hallucination: if the model generates a confidently wrong pseudo-document, retrieval amplifies the error. The mitigation is to always rerank retrieved chunks against the original query and to fall back to raw-query retrieval when confidence in the pseudo-document's results is low. This pattern is especially useful for zero-shot retrieval in a new domain, where there are no query logs to draw on for expansion.
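The flow described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real dense encoder, `generate` is a hypothetical callable wrapping whatever LLM writes the pseudo-document, and the `min_score` threshold on the best similarity score is only one assumed way to define "low confidence". A real system would likely rerank with a cross-encoder rather than reuse the same similarity function.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a dense encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(text: str, chunks: List[str], k: int) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to `text`, best first."""
    vec = embed(text)
    scored = sorted(((cosine(vec, embed(c)), c) for c in chunks), reverse=True)
    return scored[:k]


def hyde_retrieve(
    query: str,
    chunks: List[str],
    generate: Callable[[str], str],  # hypothetical LLM wrapper: query -> pseudo-document
    k: int = 3,
    min_score: float = 0.2,  # assumed confidence threshold, tune per corpus
) -> List[str]:
    # 1. Turn the short query into an answer-shaped pseudo-document.
    pseudo_doc = generate(query)

    # 2. Retrieve against the pseudo-document instead of the raw query.
    hits = top_k(pseudo_doc, chunks, k)

    # 3. Fallback: if even the best hit is weak, the pseudo-document may be
    #    off-topic, so retrieve with the raw query instead.
    if not hits or hits[0][0] < min_score:
        hits = top_k(query, chunks, k)

    # 4. Rerank the candidates against the ORIGINAL query so a hallucinated
    #    pseudo-document cannot dictate the final order on its own.
    query_vec = embed(query)
    return sorted(
        (chunk for _, chunk in hits),
        key=lambda chunk: cosine(query_vec, embed(chunk)),
        reverse=True,
    )
```

To try it, pass any function that maps a query string to a generated passage as `generate`; swapping `embed` and the rerank key for real models should not change the control flow.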
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T12:54:15.914324+00:00— report_created — created