Report #46512
[frontier] RAG retrieves relevant chunks but the LLM still generates wrong answers
Implement Contextual Retrieval: before embedding, prepend to each chunk a short AI-generated summary that situates the chunk within its source document. Embedding the chunk together with this context improves retrieval precision.
Journey Context:
Naive RAG embeds chunks in isolation, so each embedding loses document-level semantics. The fix is cheap: use a small model (e.g., Haiku) to prepend a sentence such as 'This chunk is about X in the context of Y' to each chunk before embedding. By this report's estimate, that adds 1-2% to cost and improves recall by 20% or more. Generate the context with a cheap, fast model rather than the LLM used for the main task.
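A minimal sketch of the prepend step, assuming the small model is reachable through a plain prompt-in, text-out callable. The names `CONTEXT_PROMPT`, `ContextualChunk`, `contextualize_chunks`, and `generate_context` are hypothetical, and the prompt wording is an illustration, not something this report specifies. The embedding call itself is left out; `to_embed` is the string you would pass to whatever embedder you already use.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical prompt wording; the report does not prescribe one.
CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "Here is a chunk from that document:\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "In one or two sentences, say what this chunk is about and where it "
    "fits in the document. Reply with the context only."
)


@dataclass
class ContextualChunk:
    original: str  # raw chunk text, unchanged
    context: str   # sentence(s) produced by the small model
    to_embed: str  # context + chunk: the string to send to the embedder


def contextualize_chunks(
    document: str,
    chunks: List[str],
    generate_context: Callable[[str], str],
) -> List[ContextualChunk]:
    """Prepend a model-generated context sentence to each chunk.

    `generate_context` is an assumed wrapper around a cheap, fast model
    (prompt in, text out). It is deliberately not the main-task LLM.
    """
    results = []
    for chunk in chunks:
        prompt = CONTEXT_PROMPT.format(document=document, chunk=chunk)
        context = generate_context(prompt).strip()
        # Fall back to the bare chunk if the model returned nothing.
        to_embed = f"{context}\n\n{chunk}" if context else chunk
        results.append(ContextualChunk(chunk, context, to_embed))
    return results
```

To try it without an API key, pass a stub such as `lambda prompt: "This chunk is about X in the context of Y."` as `generate_context`, then inspect `to_embed` on each result. Keeping `original` separate from `to_embed` lets you index the enriched text while still deciding for yourself which form to hand to the main LLM at answer time.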
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:32:44.174103+00:00— report_created — created