Report #50986
[research] Ignoring retrieved context and answering from parametric memory instead
Apply Context-Aware Decoding (CAD) at inference time, or enforce 'answer only using the provided context' after generation by running an NLI (Natural Language Inference) classifier over the output and filtering out statements that the context does not entail.
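A minimal sketch of the post-generation filter, assuming the answer can be split into sentences and each sentence scored against the retrieved context. The names `filter_ungrounded`, `split_sentences`, and `toy_entail_score` are hypothetical, and the 0.5 threshold is an arbitrary choice. `toy_entail_score` is only a word-overlap placeholder so the example runs without a model; it is not NLI, and a real setup would pass in a function backed by an actual NLI classifier.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def filter_ungrounded(
    answer: str,
    context: str,
    entail_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep only the answer sentences that the context supports.

    entail_score(premise, hypothesis) is assumed to return a score in [0, 1],
    e.g. the entailment probability from an NLI classifier.
    """
    kept = []
    for sentence in split_sentences(answer):
        if entail_score(context, sentence) >= threshold:
            kept.append(sentence)
    return kept


def toy_entail_score(premise: str, hypothesis: str) -> float:
    # PLACEHOLDER, not real NLI: fraction of hypothesis words found in the premise.
    words = re.findall(r"\w+", hypothesis.lower())
    premise_words = set(re.findall(r"\w+", premise.lower()))
    if not words:
        return 0.0
    return sum(w in premise_words for w in words) / len(words)


context = "The library renamed connect() to open_session() in version 3."
answer = "Use open_session() in version 3. Call connect() with a timeout argument."
grounded = filter_ungrounded(answer, context, toy_entail_score)
```

Passing the scorer in as a parameter keeps the filtering logic independent of whichever NLI model is eventually used.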
Journey Context:
When retrieved documents conflict with a model's strong parametric prior (e.g., outdated information about a software library's API), the model often falls back on what it learned in training. Simply prompting 'use the context' is not enough to override a strong prior. CAD adjusts the next-token distribution by contrasting the model's output with and without the context: tokens whose probability rises when the context is present are up-weighted, which shifts the output away from the parametric prior.
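The adjustment can be illustrated on a toy vocabulary. This is a sketch under stated assumptions, not a full decoder: it uses the commonly cited CAD formulation, softmax((1 + alpha) * logits_with_context - alpha * logits_without_context), and the function names, the three-token vocabulary, the logit values, and the alpha setting are all made up for illustration. In practice the two logit vectors would come from two forward passes of the same model, one with the retrieved context in the prompt and one without.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cad_distribution(
    logits_with_context: List[float],
    logits_without_context: List[float],
    alpha: float = 0.5,
) -> List[float]:
    """Contrast context-conditioned logits against context-free logits.

    alpha = 0 recovers ordinary decoding with the context in the prompt;
    larger alpha penalizes tokens the model favors even without the context.
    """
    if len(logits_with_context) != len(logits_without_context):
        raise ValueError("logit vectors must cover the same vocabulary")
    adjusted = [
        (1 + alpha) * lc - alpha * lp
        for lc, lp in zip(logits_with_context, logits_without_context)
    ]
    return softmax(adjusted)


# Hypothetical next-token logits over a toy vocabulary.
vocab = ["old_api", "new_api", "other"]
with_ctx = [2.0, 2.2, 0.0]     # context mentions new_api, but the prior still pulls
without_ctx = [3.0, 0.5, 0.0]  # parametric prior strongly prefers old_api

plain = softmax(with_ctx)
cad = cad_distribution(with_ctx, without_ctx, alpha=1.0)
```

With these made-up numbers, `new_api` receives more probability mass under `cad` than under `plain`, because the prior's preference for `old_api` is subtracted out.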
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:03:50.217093+00:00— report_created — created