Report #3907
[architecture] Query embeddings for short or ambiguous user questions retrieve irrelevant documents
Use Hypothetical Document Embeddings (HyDE): instruct the LLM to generate a short, ideal answer document, embed that synthetic document, and retrieve against it. Then rerank the results against the original query. Disable HyDE, or fall back to embedding the query directly, when the query is already detailed or when the answer is unlikely to exist in the corpus.
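The flow above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real dense encoder, `generate` is a hypothetical callable wrapping whatever LLM is in use, and the `max_query_words` gate is an assumed heuristic for "the query is already detailed".

```python
import math
import re
from collections import Counter
from typing import Callable, List

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real dense encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(probe: str, corpus: List[str], k: int) -> List[str]:
    """Return the k documents most similar to the probe text."""
    vec = embed(probe)
    return sorted(corpus, key=lambda d: cosine(vec, embed(d)), reverse=True)[:k]

def hyde_retrieve(
    query: str,
    corpus: List[str],
    generate: Callable[[str], str],  # hypothetical LLM wrapper (assumption)
    k: int = 3,
    max_query_words: int = 8,        # assumed threshold for "already detailed"
) -> List[str]:
    # Gate: a detailed query is embedded directly and the LLM is skipped.
    detailed = len(query.split()) > max_query_words
    probe = query if detailed else generate(query)
    # Retrieve real documents near the (possibly synthetic) probe text.
    candidates = top_k(probe, corpus, k)
    # Rerank the candidates against the ORIGINAL query, not the synthetic text.
    q_vec = embed(query)
    return sorted(candidates, key=lambda d: cosine(q_vec, embed(d)), reverse=True)

# Demo data: the relevant passage shares no words with the short query.
CORPUS = [
    "Split long documents into overlapping windows of a few hundred tokens before indexing.",
    "The cafeteria menu changes every Monday.",
    "Quarterly strategy review: best sales regions.",
]
QUERY = "best chunking strategy"

def fake_llm(query: str) -> str:
    """Canned hypothetical answer standing in for a real LLM call."""
    return ("A good approach is to split documents into overlapping "
            "windows of tokens before indexing.")

direct_hit = top_k(QUERY, CORPUS, 1)[0]                     # query embedded as-is
hyde_hit = hyde_retrieve(QUERY, CORPUS, fake_llm, k=1)[0]   # via hypothetical doc
```

With this toy data, embedding the short query directly lands on the sales review, while the hypothetical answer lands on the chunking passage. A real system would likely swap the cosine rerank for a cross-encoder or similar scorer that reads the query and document together.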
Journey Context:
Short questions like 'best chunking strategy' are phrased as questions, not answers, so their embeddings can sit far from the relevant passages in vector space. HyDE generates a hypothetical answer document, embeds it, and retrieves real documents near that synthetic embedding. The unsupervised encoder acts as a dense bottleneck that filters out many hallucinated details, and because only real documents are returned, the result stays grounded in the corpus. The risk is that a hallucinated hypothetical answer pulls in the wrong documents, especially for facts outside the corpus or for very specific, rare queries. Use HyDE to augment retrieval, not replace it, and always rerank against the original query before generation.
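One way to read "augment, not replace" is to pool candidates from both the direct query and the hypothetical document, then rerank the pool against the original query. The sketch below makes that concrete under loud assumptions: `overlap` is a toy Jaccard scorer standing in for a real encoder and reranker, `fake_llm` is a canned (and deliberately wrong) hypothetical answer, and the error code and corpus are invented for illustration.

```python
import re
from typing import Callable, List

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(a: str, b: str) -> float:
    """Toy Jaccard similarity; a stand-in for embedding similarity or a reranker."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def retrieve(probe: str, corpus: List[str], k: int) -> List[str]:
    return sorted(corpus, key=lambda d: overlap(probe, d), reverse=True)[:k]

def augmented_candidates(
    query: str,
    corpus: List[str],
    generate: Callable[[str], str],  # hypothetical LLM wrapper (assumption)
    k: int = 2,
) -> List[str]:
    direct = retrieve(query, corpus, k)            # plain query retrieval
    hyde = retrieve(generate(query), corpus, k)    # retrieval via hypothetical doc
    # Merge both pools, dropping duplicates while keeping first-seen order.
    return list(dict.fromkeys(direct + hyde))

def rerank(query: str, candidates: List[str],
           score: Callable[[str, str], float]) -> List[str]:
    """Order candidates by relevance to the ORIGINAL query."""
    return sorted(candidates, key=lambda d: score(query, d), reverse=True)

# Invented example of a rare, specific query where the LLM guesses wrong.
CORPUS = [
    "E42: license token expired, renew it in the admin console.",
    "Disk full errors: free up space by deleting old logs.",
    "The cafeteria menu changes every Monday.",
]
QUERY = "error code E42"

def fake_llm(query: str) -> str:
    """Canned hallucinated answer: plausible, but wrong for this corpus."""
    return "Error E42 means the disk is full, so free up space."

hyde_only = retrieve(fake_llm(QUERY), CORPUS, 1)       # follows the hallucination
pool = augmented_candidates(QUERY, CORPUS, fake_llm, k=1)
final = rerank(QUERY, pool, overlap)
```

Here HyDE on its own follows the hallucinated "disk full" answer, while the merged pool still contains the license passage and the rerank against the original query puts it first. In practice the `score` argument would more plausibly be a cross-encoder than a token-overlap measure.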
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:29:23.109146+00:00— report_created — created