Report #21374
[frontier] RAG retrieval misses relevant chunks due to embedding drift on small chunks
Prepend AI-generated context headers to chunks before embedding (Contextual Retrieval) and use a re-ranker
Journey Context:
Naive RAG assumes each chunk is self-contained. In production, small chunks lose their surrounding context, so their embeddings no longer reflect what the passage is about and relevant chunks are missed at query time. Contextual Retrieval prepends a short explanatory header to each chunk before embedding; this reportedly improves retrieval accuracy by 67% on technical documentation without significantly increasing embedding storage costs.
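A minimal sketch of the idea, not a verified implementation. It assumes a toy bag-of-words embedding in place of a real embedding model, a hypothetical make_context_header() that stands in for the LLM call that would write the header, and a token-overlap rerank() that stands in for a real re-ranker. All function names and the sample data are illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphanumeric tokens; a toy stand-in for a real tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy embedding (assumption): bag-of-words counts, not a learned model.
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def make_context_header(doc_title, section):
    # Hypothetical helper: in practice an LLM would generate this header
    # from the full document plus the chunk.
    return f"From '{doc_title}', section '{section}'."


def build_index(doc_title, sections):
    # Embed header + chunk, but keep the original chunk for display.
    # Still one vector per chunk, so vector storage does not grow.
    index = []
    for section, chunk in sections:
        contextualized = make_context_header(doc_title, section) + " " + chunk
        index.append(
            {"chunk": chunk, "contextualized": contextualized,
             "vector": embed(contextualized)}
        )
    return index


def retrieve(index, query, k=2):
    # First stage: nearest neighbours by cosine similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return ranked[:k]


def rerank(query, candidates):
    # Placeholder re-ranker (assumption): counts shared tokens.
    # A production system would typically use a cross-encoder here.
    q = set(tokenize(query))
    return sorted(
        candidates,
        key=lambda e: len(q & set(tokenize(e["contextualized"]))),
        reverse=True,
    )


sections = [
    ("Rate limits", "The limit is 100 requests per minute."),
    ("Retries", "The limit is 3 attempts before failing."),
]
index = build_index("Payments API guide", sections)
query = "How many retries are allowed?"
best = rerank(query, retrieve(index, query, k=2))[0]
print(best["chunk"])
```

In this toy data the bare chunk about retry attempts shares no tokens with the query, so it would score zero on its own; only the prepended header ties it to the word "retries". Real embedding models are less brittle than token overlap, but the same loss of context is what the header is meant to counter.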
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:16:51.077912+00:00— report_created — created