Report #36141
[frontier] RAG systems retrieving semantically similar but contextually wrong chunks, causing hallucinations in production agents
Implement Contextual Retrieval: prepend a short, chunk-specific explanatory context (generated by an LLM) to each chunk before embedding; index the same contextualized chunks with BM25 (Contextual BM25) and combine it with semantic search for hybrid retrieval; then rerank the merged candidates with a cross-encoder that sees the contextualized chunks. This replaces naive chunk-and-embed.
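The hybrid step above needs some way to merge the BM25 ranking with the semantic ranking before reranking. A minimal sketch, assuming reciprocal rank fusion (RRF) as the merge rule; the report does not say which fusion method to use, and the function name, the `k=60` constant, and the chunk ids are illustrative assumptions:

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each list contributes 1 / (k + rank) per chunk, so chunks that rank
    well in both BM25 and semantic search rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from the two retrievers (best match first).
bm25_ranking = ["c3", "c1", "c7"]
semantic_ranking = ["c1", "c4", "c3"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

The fused list would then be truncated to a candidate set and passed to the cross-encoder reranker, which scores each contextualized chunk against the query.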
Journey Context:
Standard RAG (2023-2024) splits documents into arbitrary chunks, embeds them, and retrieves by semantic similarity. It suffers from 'lost context': the embedding represents the specific sentence but misses the broader document meaning. Anthropic's Contextual Retrieval (Sept 2024, adoption peak 2025) adds 'contextual headers' to each chunk before embedding (e.g., 'This chunk is from a section about database optimization...'). Tradeoff: it requires an extra LLM pass during indexing (higher cost) and more storage, but the retrieval accuracy improvement (49% fewer failed retrievals in Anthropic's tests when contextual embeddings are combined with Contextual BM25) makes it essential for agentic systems that cannot afford to hallucinate on retrieved facts. This is becoming the default for production agent RAG in 2025.
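The indexing-time pass described above can be sketched as follows. This is a minimal illustration, not Anthropic's implementation: the prompt wording is a paraphrase, `generate_context` is a hypothetical callable standing in for whatever LLM client is in use, and `fake_llm` exists only so the example runs without network access.

```python
from typing import Callable, List

# Illustrative prompt; the exact wording is an assumption.
CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "Here is the chunk to situate within the document:\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "Give a short context that situates this chunk within the overall "
    "document to improve search retrieval. Answer with the context only."
)


def contextualize_chunks(
    document: str,
    chunks: List[str],
    generate_context: Callable[[str], str],
) -> List[str]:
    """Prepend an LLM-written context header to each chunk.

    The returned strings are what gets embedded and indexed with BM25,
    instead of the raw chunks.
    """
    contextualized = []
    for chunk in chunks:
        prompt = CONTEXT_PROMPT.format(document=document, chunk=chunk)
        context = generate_context(prompt).strip()
        # Fall back to the raw chunk if the model returns nothing useful.
        contextualized.append(f"{context}\n\n{chunk}" if context else chunk)
    return contextualized


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a fixed header."""
    return "This chunk is from a section about database optimization."


indexed = contextualize_chunks(
    document="Full document text goes here.",
    chunks=["Add an index on user_id to speed up lookups."],
    generate_context=fake_llm,
)
print(indexed[0])
```

Because the full document is sent once per chunk, this pass is where the extra indexing cost comes from; caching the document portion of the prompt, where the LLM provider supports it, is one way to reduce that cost.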
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:08:21.374817+00:00— report_created — created