Report #62701
[frontier] Standard RAG with naive chunking loses contextual information (surrounding paragraphs, document structure), leading to poor retrieval quality for agent tasks that require nuanced understanding of long documents.
Implement Contextual Retrieval as described by Anthropic: prepend chunk-specific explanatory context (situational text describing the document and chunk location) to each chunk before embedding, and use BM25 + vector hybrid search to significantly improve retrieval accuracy.
Journey Context:
Naive chunking embeds sentences like 'The company was founded in 2020' with no indication of what 'The company' refers to or what kind of document the sentence comes from (financial report vs. blog post). Contextual Retrieval uses an LLM to generate a short context for each chunk, describing the source document and where the chunk sits in it, and prepends that context to the chunk before embedding, so the indexed text becomes '[generated context] The company was founded in 2020'. At query time, the added context disambiguates similar terms across documents. Combined with hybrid search (BM25 for keyword matching + vector search for semantic matching), Anthropic reports a substantial improvement over naive RAG. For agents, this means retrieving the correct context for a user query more often, instead of hallucinating from similar but irrelevant chunks. Tradeoff: higher indexing cost (one LLM call per chunk to generate its context) versus the retrieval accuracy gain (Anthropic reported a 49% reduction in failed retrievals).
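The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `contextualize` is a hypothetical template standing in for the per-chunk LLM call, `cosine_scores` uses toy bag-of-words vectors in place of a real embedding model, the two rankings are merged with reciprocal rank fusion (one common choice; the report does not say how to combine them), and the document titles are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def contextualize(doc_title, chunk):
    # Hypothetical stand-in: a real pipeline would ask an LLM to write this context.
    return f"This chunk is from {doc_title}. {chunk}"


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Plain BM25 over the given texts (keyword side of the hybrid search).
    toks = [tokenize(d) for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    df = Counter(term for t in toks for term in set(t))
    n = len(docs)
    scores = []
    for t in toks:
        tf = Counter(t)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine_scores(query, docs):
    # Toy bag-of-words cosine similarity; assumption: swap in real embeddings here.
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(x * x for x in q.values()))
    scores = []
    for d in docs:
        v = Counter(tokenize(d))
        v_norm = math.sqrt(sum(x * x for x in v.values()))
        dot = sum(q[t] * v[t] for t in q)
        scores.append(dot / (q_norm * v_norm) if q_norm and v_norm else 0.0)
    return scores


def reciprocal_rank_fusion(rankings, k=60):
    # Each ranking is a list of document indices, best first.
    fused = Counter()
    for ranking in rankings:
        for rank, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (k + rank)
    return [idx for idx, _ in fused.most_common()]


def hybrid_search(query, docs):
    def rank(scores):
        return sorted(range(len(docs)), key=lambda i: -scores[i])

    return reciprocal_rank_fusion(
        [rank(bm25_scores(query, docs)), rank(cosine_scores(query, docs))]
    )


# Two documents containing the same ambiguous sentence (titles are invented).
raw = [
    ("ACME Corp 2023 annual financial report", "The company was founded in 2020."),
    ("Globex engineering blog post", "The company was founded in 2020."),
]
index = [contextualize(title, chunk) for title, chunk in raw]
best = hybrid_search("When was ACME founded?", index)[0]
```

Without the prepended context the two chunks are identical and no retriever can tell them apart; with it, both the keyword and the vector ranking can prefer the ACME chunk. In a real pipeline the context generation runs once at indexing time, which is where the extra cost noted above comes from.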
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:43:29.227597+00:00— report_created — created