Report #11995
[research] When asked to cite sources, LLMs disproportionately cite documents that appear earlier in the retrieved context, regardless of their actual relevance
Randomize the order of retrieved documents before passing them to the generator, or use a sliding window attention mechanism. Evaluate citation quality with citation recall and precision, such as the metrics defined in the ALCE benchmark, and compare the scores across several document orderings so that position bias becomes visible.
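A minimal sketch of the shuffling step, assuming the retrieved documents are plain strings and the generator cites them by bracketed number. The function names (shuffle_documents, build_prompt, map_citations_back) and the prompt wording are hypothetical and not taken from any particular RAG library.

```python
import random
from typing import List, Optional, Sequence, Tuple


def shuffle_documents(
    docs: Sequence[str], seed: Optional[int] = None
) -> Tuple[List[str], List[int]]:
    """Return (shuffled_docs, order), where order[i] is the original
    index of shuffled_docs[i]. The input sequence is not modified."""
    rng = random.Random(seed)  # local RNG so global random state is untouched
    order = list(range(len(docs)))
    rng.shuffle(order)
    return [docs[i] for i in order], order


def build_prompt(question: str, docs: Sequence[str]) -> str:
    """Number the documents in the order given (hypothetical prompt format)."""
    numbered = [f"[{i + 1}] {doc}" for i, doc in enumerate(docs)]
    return (
        "\n".join(numbered)
        + f"\n\nQuestion: {question}\nCite sources as [n]."
    )


def map_citations_back(cited_labels: Sequence[int], order: Sequence[int]) -> List[int]:
    """Convert 1-based labels from the shuffled prompt into original
    0-based document indices."""
    return [order[label - 1] for label in cited_labels]
```

Keeping the order list is what lets citations from the shuffled prompt be compared against the original retrieval ranking. Running the same question with several seeds and averaging the scores is one way to reduce the influence of any single ordering.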
Journey Context:
LLMs exhibit position bias (primacy and recency effects). In citation generation tasks, the model tends to anchor on the first few documents it reads and bends its claims to fit them, ignoring highly relevant documents further down the context. Randomizing the input order during inference does not remove the bias itself, but it decouples a document's position from its retrieval rank, which exposes the bias and often improves average citation precision.
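One way to check for this anchoring is to record which prompt positions get cited over many runs and compare that with how often the cited documents are actually relevant. The sketch below assumes citations have already been parsed into 0-based prompt positions and that gold relevance labels exist. Both helper names are hypothetical, and the precision function is a simplified label-overlap proxy, not the entailment-based citation precision used by ALCE.

```python
from collections import Counter
from typing import List, Sequence


def citation_rate_by_position(
    cited_positions_per_trial: Sequence[Sequence[int]], num_positions: int
) -> List[float]:
    """Fraction of trials in which each prompt position was cited at
    least once. A steep drop after the first positions suggests primacy bias."""
    counts: Counter = Counter()
    for cited in cited_positions_per_trial:
        for pos in set(cited):  # count each position once per trial
            counts[pos] += 1
    n_trials = len(cited_positions_per_trial)
    if n_trials == 0:
        return [0.0] * num_positions
    return [counts[p] / n_trials for p in range(num_positions)]


def citation_precision(
    cited_doc_ids: Sequence[int], relevant_doc_ids: Sequence[int]
) -> float:
    """Share of cited documents that are labeled relevant
    (simplified proxy; assumes gold relevance labels are available)."""
    cited = set(cited_doc_ids)
    if not cited:
        return 0.0
    return len(cited & set(relevant_doc_ids)) / len(cited)
```

If document order is randomized per trial, a model without position bias should show a roughly flat citation rate across positions, so a skew toward early positions is attributable to position instead of relevance.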
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T14:49:16.896972+00:00— report_created — created