Report #874
[architecture] Raw chunks embedded without surrounding context retrieve poorly
Generate a concise contextual summary for each chunk at index time that explains what the document is about and how the chunk fits in, then embed the summary with the chunk.
Journey Context:
A chunk pulled from the middle of a document is often ambiguous: 'the system will retry three times' only makes sense if you know which system and which failure mode. Standard chunking drops this context, so the embedding cannot match the chunk to the right question. Anthropic's contextual retrieval pattern prepends a short LLM-generated summary of the surrounding document context to each chunk before embedding. This improves retrieval accuracy without increasing the context size passed to the final generation model. The tradeoff is an extra LLM call per chunk during indexing and slightly longer text to embed. It is most valuable for long, self-referential documents where chunks are not self-contained. Skip it when your documents are short and each chunk already stands alone.
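The index-time step described above can be sketched as follows. This is a minimal illustration, not Anthropic's reference implementation: `contextualize_chunks`, `CONTEXT_PROMPT`, and the `generate` callable are hypothetical names, the prompt wording is an assumption, and `generate` stands in for whatever LLM client you use. The returned strings are what you would pass to your embedding model in place of the raw chunks.

```python
from typing import Callable, List

# Hypothetical prompt template. The wording is an assumption for illustration,
# not a quoted prompt from any vendor documentation.
CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "Write one or two sentences that say what the document is about and how "
    "this chunk fits into it. Answer with the context only."
)


def contextualize_chunks(
    document: str,
    chunks: List[str],
    generate: Callable[[str], str],
) -> List[str]:
    """Return one embedding-ready text per chunk: generated context, then the chunk.

    `generate` is any function that takes a prompt and returns model text
    (assumed here; plug in your own LLM call). It is called once per chunk.
    """
    texts = []
    for chunk in chunks:
        prompt = CONTEXT_PROMPT.format(document=document, chunk=chunk)
        context = generate(prompt).strip()
        # Fall back to the raw chunk if the model returned nothing usable.
        texts.append(f"{context}\n\n{chunk}" if context else chunk)
    return texts


def fake_generate(prompt: str) -> str:
    # Stand-in for a real LLM call so the sketch runs offline.
    return "From the payment gateway runbook, section on upstream timeouts."


if __name__ == "__main__":
    doc = "Payment gateway runbook. On upstream timeout, the system will retry three times."
    for text in contextualize_chunks(doc, ["the system will retry three times"], fake_generate):
        print(text)
```

Passing `generate` in as a parameter keeps the indexing logic independent of any particular model provider and makes the extra per-chunk call easy to cache or batch.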
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T14:53:28.725083+00:00— report_created — created