Report #82921
[research] LLM outputs widely believed but factually incorrect information (popularity bias)
When querying for facts that are susceptible to common myths, append 'Ignore common misconceptions' to the prompt, or explicitly ask the model to double-check its answer against known counter-myths. For high-stakes facts, use a second LLM call to challenge the initial answer.
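The workaround above can be sketched roughly as follows. This is a minimal illustration, not a tested recipe: `AskFn`, `MYTH_GUARD`, `build_myth_aware_prompt`, `build_challenge_prompt`, and `answer_with_challenge` are hypothetical names, and the `REVISE:` / `OK` reply convention is an assumption made for this sketch. The `ask` callable stands in for whatever LLM client you use; no specific vendor API is implied.

```python
from typing import Callable, Optional

# Hypothetical type: any function that sends a prompt to an LLM and returns text.
AskFn = Callable[[str], str]

# Instruction appended to the question (wording taken from the workaround,
# extended with an assumed double-check request).
MYTH_GUARD = (
    "Ignore common misconceptions. Before answering, check the claim "
    "against known counter-myths and say so if the popular answer is wrong."
)


def build_myth_aware_prompt(question: str) -> str:
    """Append the myth-guard instruction to the user's question."""
    return f"{question.strip()}\n\n{MYTH_GUARD}"


def build_challenge_prompt(question: str, first_answer: str) -> str:
    """Build the prompt for the second call that challenges the first answer."""
    return (
        "You are reviewing another model's answer for popular myths.\n"
        f"Question: {question.strip()}\n"
        f"Answer: {first_answer.strip()}\n"
        "If the answer repeats a widely believed but incorrect claim, "
        "reply with 'REVISE:' followed by the corrected answer. "
        "Otherwise reply with 'OK'."
    )


def answer_with_challenge(
    question: str, ask: AskFn, challenger: Optional[AskFn] = None
) -> str:
    """Ask once with the myth guard, then let a second call challenge the result."""
    challenger = challenger or ask  # reuse the same model if no second one is given
    first = ask(build_myth_aware_prompt(question))
    verdict = challenger(build_challenge_prompt(question, first)).strip()
    # Assumed convention: the challenger prefixes a correction with 'REVISE:'.
    if verdict.upper().startswith("REVISE:"):
        return verdict[len("REVISE:"):].strip()
    return first
```

Passing a different model as `challenger` may reduce the chance that both calls share the same bias, though that is not guaranteed. The corrected answer is itself unverified and should still be checked for high-stakes facts.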
Journey Context:
LLMs learn statistical correlations from training data. If a misconception appears more often than the correct fact in the training corpus, the model tends to reproduce the myth, often confidently. RLHF can exacerbate this when raters reward answers that match majority belief. Prompting for counter-myths is intended to push the model away from the most statistically likely completion and toward explicitly evaluating the claim.
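The frequency effect described above can be illustrated with a deliberately crude toy. This is an analogy only: real LLMs are not literal phrase counters, and the corpus, the `most_frequent_completion` helper, and the counts are invented for illustration.

```python
from collections import Counter


def most_frequent_completion(corpus: list[str], prefix: str) -> str:
    """Return the most common continuation of `prefix` in a toy corpus."""
    continuations = Counter(
        line[len(prefix):].strip() for line in corpus if line.startswith(prefix)
    )
    if not continuations:
        raise ValueError(f"no line in the corpus starts with {prefix!r}")
    return continuations.most_common(1)[0][0]


# Made-up corpus in which the myth is stated three times and the fact once.
toy_corpus = [
    "Humans use 10% of their brains",
    "Humans use 10% of their brains",
    "Humans use 10% of their brains",
    "Humans use virtually all of their brains",
]

# The frequency-based pick is the myth, because it simply appears more often.
completion = most_frequent_completion(toy_corpus, "Humans use ")
```

In this toy, the pick changes only if the balance of statements in the corpus changes, which is why the workaround tries to change the prompt instead.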
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:46:24.914855+00:00— report_created — created