Report #79811
[frontier] Embedding document chunks without context produces poor retrieval for ambiguous or referential queries
Prepend brief contextual metadata to each chunk before embedding (contextual embeddings) and before BM25 indexing (contextual BM25). Use a fast, small model to generate the context prefix. Combine both in hybrid search.
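A minimal sketch of the prepending step, assuming the chunks have already been split. The names `ContextualChunk`, `contextualize`, and `stub_context` are hypothetical, and `stub_context` is only a placeholder for the small, fast model call; it is not a real LLM client.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ContextualChunk:
    original: str  # chunk text exactly as split from the document
    context: str   # short situating prefix produced for this chunk

    @property
    def indexed_text(self) -> str:
        # The same string is meant to feed both the embedding model
        # and the BM25 index, so the two indexes stay consistent.
        return f"{self.context}\n\n{self.original}"


def contextualize(
    document: str,
    chunks: List[str],
    generate_context: Callable[[str, str], str],
) -> List[ContextualChunk]:
    """Attach a context prefix to every chunk of one document."""
    return [
        ContextualChunk(original=c, context=generate_context(document, c).strip())
        for c in chunks
    ]


def stub_context(document: str, chunk: str) -> str:
    # Hypothetical stand-in for the LLM call: it only reuses the document's
    # first line. A real generator would see the document and the chunk and
    # write a one-sentence description of where the chunk sits.
    title = document.splitlines()[0]
    return f"This chunk is from: {title}."


doc = "Acme Corp Q3 2024 earnings\nCloud revenue grew 15% year over year."
result = contextualize(doc, ["The revenue grew 15%."], stub_context)
print(result[0].indexed_text)
```

Keeping `original` separate from `indexed_text` lets the retriever return the untouched chunk to the generator while indexing the prefixed version.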
Journey Context:
Standard chunk-and-embed RAG loses document context: a chunk saying 'the revenue grew 15%' is meaningless without knowing which quarter, which product line, and which company. The traditional fixes are inadequate: larger chunks add noise and cost, and metadata filters are rigid and cannot handle nuance. Contextual retrieval solves this by having a small, fast LLM generate a brief context prefix for each chunk ('This chunk is from Q3 2024 earnings of Acme Corp, discussing cloud revenue growth') before embedding. This makes the embeddings semantically richer without widening the chunk itself. Combined with contextual BM25 (same prefix on the lexical index), hybrid search retrieval improves substantially: Anthropic reports a 49% reduction in retrieval failures for contextual embeddings plus contextual BM25, rising to 67% when a reranking step is added. The tradeoff is a one-time preprocessing cost per document, amortized across all future queries.
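The report does not say how the two result lists are combined. One common option is reciprocal rank fusion, sketched below as an assumption rather than as the method used in the report. The function name, the chunk ids, and the constant `k = 60` are illustrative choices; the input rankings are assumed to come from an embedding search and a BM25 search over the same context-prefixed chunks.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of chunk ids into one ranking.

    Each list contributes 1 / (k + rank) for every id it contains,
    with ranks starting at 1. Ids missing from a list get nothing from it.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists for a single query.
dense_hits = ["c2", "c1", "c3"]  # from the contextual embedding index
bm25_hits = ["c1", "c4", "c2"]   # from the contextual BM25 index
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Rank-based fusion sidesteps the fact that cosine similarities and BM25 scores live on different scales; a weighted sum of normalized scores would be another reasonable choice.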
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:33:38.327648+00:00— report_created — created