Report #53881
[frontier] RAG retrieval accuracy poor with chunked documents missing surrounding context
Implement contextual retrieval: prepend AI-generated context to each chunk before embedding. Use a cheap LLM (e.g., Haiku) to write a concise description of where that chunk sits in the broader document, then embed the concatenation of context + chunk.
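A minimal sketch of the indexing step described above, assuming the LLM call is passed in as a plain function. The prompt wording, `CONTEXT_PROMPT`, `contextualize_chunks`, and the `generate` parameter are all hypothetical names chosen for illustration, not part of any library.

```python
from typing import Callable, List

# Hypothetical prompt wording; adjust it for your own documents.
CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "Here is a chunk from that document:\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "Write a short context that situates this chunk within the overall "
    "document to improve search retrieval. Answer with the context only."
)


def contextualize_chunks(
    document: str,
    chunks: List[str],
    generate: Callable[[str], str],
) -> List[str]:
    """Return one 'context + chunk' string per chunk, ready to embed.

    `generate` is an assumed wrapper around whatever cheap model you use:
    it takes a prompt string and returns the model's text reply.
    """
    contextualized = []
    for chunk in chunks:
        prompt = CONTEXT_PROMPT.format(document=document, chunk=chunk)
        context = generate(prompt).strip()
        # Embed this concatenation instead of the bare chunk.
        # Fall back to the bare chunk if the model returned nothing.
        contextualized.append(f"{context}\n\n{chunk}" if context else chunk)
    return contextualized
```

Each returned string is what gets passed to the embedding model. The original chunk text can still be stored separately if you want to show it to the answering model without the generated context.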
Journey Context:
Standard RAG embeds chunks in isolation, so document-level context is lost. Contextual retrieval uses a cheap model to generate a short 'context' explaining where the chunk fits in the document, then embeds [context + chunk]. The report states that this outperforms complex re-ranking pipelines with minimal latency cost, since the context is generated once at indexing time rather than per query, and that it is production-tested at Anthropic as a replacement for naive chunking.
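A toy illustration of why an isolated chunk retrieves poorly. This sketch assumes a bag-of-words cosine similarity as a stand-in for a real embedding model, and the chunk, context, and query strings are invented for the example.

```python
import math
import re
from collections import Counter


def bow(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented example data: the chunk never names the company or the period.
chunk = "Revenue grew 3% over the previous quarter."
# Hypothetical context a cheap model might write for this chunk.
context = "From the ACME Corp Q2 2023 financial report."
query = "ACME Q2 2023 revenue growth"

bare_score = cosine(bow(query), bow(chunk))
contextual_score = cosine(bow(query), bow(context + " " + chunk))
print(f"bare: {bare_score:.2f}, contextual: {contextual_score:.2f}")
```

The bare chunk shares only one token with the query, while the contextualized version also matches the company name and reporting period. A real embedding model behaves less literally than token overlap, so treat this only as intuition for the effect.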
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:56:05.283030+00:00— report_created — created