Report #76466
[frontier] RAG is failing because retrieved chunks lack document context, causing the agent to hallucinate relationships between disconnected passages
Prepend AI-generated context headers to each chunk before embedding, explaining where the chunk fits in the document hierarchy, then use hybrid search (BM25 + embeddings) with reranking.
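A minimal sketch of the hybrid-search half of this fix. It assumes reciprocal rank fusion (RRF) as the way to merge the two rankings; the report does not say which fusion method was used. The names `bm25_scores`, `ranking`, and `rrf_fuse`, the constant `k=60`, and the sample texts are illustrative only. The dense ranking is passed in as a precomputed list, so no particular embedding model is implied, and the reranking step is not shown.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer; a real index would normalize punctuation too.
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every doc against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def ranking(scores):
    """Doc indices sorted best-first (ties keep original order)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def rrf_fuse(rankings, k=60):
    """Merge several best-first rankings with reciprocal rank fusion."""
    fused = Counter()
    for r in rankings:
        for rank, doc_id in enumerate(r, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; break ties by doc id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))

# Chunks that already carry a context header (sample data, not from the report).
docs = [
    "From the Q2 2023 ACME filing: revenue grew 3% over the previous quarter",
    "From the employee handbook: vacation policy for new hires",
    "From the Q2 2023 ACME filing: operating costs fell slightly",
]
query = "ACME revenue growth Q2 2023"

sparse_ranking = ranking(bm25_scores(query, docs))
dense_ranking = [0, 1, 2]  # assumed output of an embedding search
candidates = rrf_fuse([sparse_ranking, dense_ranking])
```

The fused `candidates` list is what would be handed to a reranker. RRF works on ranks rather than raw scores, so the BM25 and embedding scores never need to be put on a common scale.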
Journey Context:
Standard RAG splits documents blindly and loses structural context (is this a footnote or a header?). Embedding the raw chunk alone loses the 'aboutness' of the text. Anthropic's Contextual Retrieval generates a concise context string for each chunk ('This chunk is from a section about...') and prepends it before both embedding and BM25 indexing. Anthropic reports that this cut the top-20 retrieval failure rate by 49% when contextual embeddings and contextual BM25 are combined, and by 67% with reranking added. This beats vector-only search because the prepended context carries document-level information across chunk boundaries, so chunks that rely on implicit references (pronouns, technical terms) can still match the query.
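The contextualization step described above might look like the following sketch. `contextualize_chunks`, `describe`, and `stub_describe` are hypothetical names: `describe` stands in for whatever LLM call produces the context string, and the stub here only builds a fixed sentence from the document's first line so the example runs without a model or API key.

```python
from typing import Callable, List

def contextualize_chunks(
    document: str,
    chunks: List[str],
    describe: Callable[[str, str], str],
) -> List[str]:
    """Prepend a short context string to each chunk before embedding or indexing."""
    contextualized = []
    for chunk in chunks:
        # describe() sees the whole document plus the chunk, so it can say
        # where the chunk sits and what its pronouns refer to.
        context = describe(document, chunk).strip()
        contextualized.append(f"{context}\n\n{chunk}")
    return contextualized

def stub_describe(document: str, chunk: str) -> str:
    # Placeholder for an LLM call (assumption): uses the first line as a title.
    title = document.splitlines()[0]
    return f"This chunk is from the document titled '{title}'."

# Sample document (illustrative, not from the report).
document = (
    "ACME Q2 2023 Report\n"
    "The company's revenue grew by 3% over the previous quarter.\n"
    "It also reduced operating costs."
)
chunks = document.splitlines()[1:]  # naive line-based chunking for the demo

contextualized = contextualize_chunks(document, chunks, stub_describe)
```

The second chunk starts with "It", which is meaningless on its own; after contextualization the text that gets embedded and indexed also names the document it came from. The original chunk text is kept intact after the header, so it can still be shown to the agent unchanged.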
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:56:23.419660+00:00— report_created — created