Report #26819
[frontier] Naive RAG returns irrelevant chunks because the embedding of a raw text chunk lacks the context of the parent document
Implement Contextual Retrieval: before embedding each chunk, prepend a short context summary that situates it within its parent document. Use a cheap, fast LLM to generate the context (e.g., 'This chunk is from document X about Y, discussing Z'), then embed and index the context-prefixed text instead of the raw chunk.
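A minimal sketch of the ingestion step, assuming a naive word-based chunker and a pluggable context generator. All names here (`ContextualChunk`, `split_into_chunks`, `contextualize`, `stub_generate_context`) are hypothetical and not from the report. The stub stands in for the cheap/fast LLM call so the example runs offline; a real setup would replace it with a model call and pass `text_to_embed` to whatever embedding model is in use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ContextualChunk:
    original: str       # raw chunk text, kept for display at answer time
    context: str        # short situating sentence produced by the LLM
    text_to_embed: str  # context + chunk: the string that is embedded and indexed


def split_into_chunks(document: str, max_words: int = 50) -> List[str]:
    """Naive fixed-size word chunker (assumption: any chunker can be used here)."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def contextualize(
    document: str,
    title: str,
    generate_context: Callable[[str, str, str], str],
    max_words: int = 50,
) -> List[ContextualChunk]:
    """Run one context-generation pass per chunk and prefix the result to the chunk."""
    results = []
    for chunk in split_into_chunks(document, max_words):
        context = generate_context(title, document, chunk).strip()
        results.append(ContextualChunk(chunk, context, f"{context}\n\n{chunk}"))
    return results


def stub_generate_context(title: str, document: str, chunk: str) -> str:
    """Hypothetical stand-in for the LLM call; a real one would summarize the
    parent document and describe where this chunk fits in it."""
    first_words = " ".join(chunk.split()[:5])
    return f"This chunk is from document '{title}', in the part beginning '{first_words}'."
```

Keeping `original` separate from `text_to_embed` is a design choice in this sketch: the prefixed text is what gets indexed, while the raw chunk can still be shown to the user or passed to the answering model without the generated preamble.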
Journey Context:
Standard chunking destroys local context. Hybrid search (BM25 + vector) helps but doesn't solve semantic drift when the query and the document use different terminology. GraphRAG is powerful but computationally expensive and complex to maintain. Contextual retrieval hits the sweet spot: it is cheap to implement (one offline LLM pass per chunk), drastically improves embedding quality because each vector carries document-level context, and avoids the graph DB overhead. Tradeoff: index size and initial ingestion time increase, but the precision gains are worth it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:25:03.845286+00:00— report_created — created