Report #78198
[frontier] Naive RAG retrieves irrelevant chunks due to lack of context
Implement Contextual Retrieval following Anthropic's approach: use a fast LLM (Haiku) to generate a short context for each chunk, prepend it to the chunk before embedding, then retrieve against the enriched chunks and return the parent document for synthesis
Journey Context:
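Standard RAG embeds document chunks in isolation. A chunk saying 'The company was founded in 1985' is semantically close to any query about a 1985 founding, even when the query concerns a different company. Contextual Retrieval prepends a short situating context to each chunk before embedding, for example: 'This chunk is from an article about Acme Corp's early history; it discusses their founding year.' A fast LLM (Haiku) generates this context for each chunk, given the full document. At query time, retrieve using these enriched embeddings, but return the original parent document (or a larger chunk) to the LLM for synthesis. Anthropic's published results report that, relative to their baseline, contextual embeddings alone cut top-20 retrieval failures by about 35%, and by about 49% when combined with a contextual BM25 index, so the ~50% figure applies to the hybrid setup; adding a reranker on top brings the reduction to about 67%. The technique therefore complements hybrid search and reranking rather than replacing them. It helps most with chunks from the middle of a document, which on their own often never name the subject they describe.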
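The pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `stub_context` is a hypothetical placeholder for the per-chunk LLM (Haiku) call, and `Chunk`, `build_index`, `retrieve`, and `DOCS` are names invented for this example.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt
from typing import Callable


@dataclass
class Chunk:
    doc_id: str          # parent document this chunk came from
    text: str            # original chunk text
    contextualized: str  # generated context + original text (indexing only)


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumption standing in for a real embedder."""
    return Counter(w.strip(".,;:?!'\"").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def stub_context(full_doc: str, chunk: str) -> str:
    """Hypothetical placeholder for the LLM call that situates a chunk in its document."""
    return f"This chunk is from a document that begins: {full_doc.split('.')[0]}."


def build_index(docs: dict, make_context: Callable[[str, str], str]) -> list:
    """Embed each chunk with its generated context prepended."""
    index = []
    for doc_id, chunks in docs.items():
        full_doc = " ".join(chunks)
        for text in chunks:
            context = make_context(full_doc, text)
            chunk = Chunk(doc_id, text, f"{context} {text}")
            index.append((embed(chunk.contextualized), chunk))
    return index


def retrieve(index: list, docs: dict, query: str, k: int = 1) -> list:
    """Rank enriched chunks, then return their parent documents (deduplicated)."""
    q = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[0]), reverse=True)
    parents = []
    for _, chunk in ranked[:k]:
        if chunk.doc_id not in parents:
            parents.append(chunk.doc_id)
    return [(doc_id, " ".join(docs[doc_id])) for doc_id in parents]


# Two documents that share an identical, ambiguous chunk.
DOCS = {
    "acme": ["Acme Corp makes industrial anvils.", "The company was founded in 1985."],
    "globex": ["Globex builds weather satellites.", "The company was founded in 1985."],
}

index = build_index(DOCS, stub_context)
hits = retrieve(index, DOCS, "When was Globex founded?", k=1)
print(hits[0][0])
```

With a context generator that returns an empty string, the two 'founded in 1985' chunks produce identical vectors and cannot be told apart by the query; the prepended context is what separates them. In a real deployment the stub would be replaced by an LLM call that receives the full document and the chunk, and the toy vectors by a proper embedding model.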
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:50:55.033732+00:00— report_created — created