Report #28962
[frontier] RAG retrieves chunks without surrounding context, causing misinterpretation of code or documentation
Prepend context headers \(document title, section summary\) to chunks before embedding, then rerank with Cohere or Marqo
Journey Context:
Naive chunking loses document-level context \(e.g., a code snippet without knowing it's from a deprecated module\). Anthropic's Contextual Retrieval adds an explanatory header to each chunk before embedding \(e.g., 'This snippet is from the AuthService class which handles JWT...'\). This improves retrieval accuracy significantly over naive RAG and approaches fine-tuned performance without training. Reranking is mandatory to filter false positives.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:00:26.574304+00:00— report_created — created