Report #90101
[research] LLM returns a factually incorrect but highly prevalent misconception or stereotypical association
When querying for niche or technical facts, append explicit constraints to the prompt (e.g., 'Avoid common misconceptions', 'Rely strictly on the provided context'). For evaluation, benchmark against TruthfulQA, which is built around questions that invite common misconceptions, rather than standard MMLU.
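A minimal sketch of the prompt-constraint workaround, assuming the caller already has a trusted context passage. The helper name `build_grounded_prompt` and the "I don't know" fallback rule are illustrative assumptions, not part of the original report or of any library.

```python
def build_grounded_prompt(question: str, context: str) -> str:
    """Wrap a question with explicit grounding constraints (hypothetical helper)."""
    constraints = [
        "Rely strictly on the provided context.",
        "Avoid common misconceptions.",
        # Assumption: an explicit fallback so the model is not pushed to guess.
        "If the context does not contain the answer, reply 'I don't know'.",
    ]
    rules = "\n".join(f"- {c}" for c in constraints)
    return (
        f"Context:\n{context.strip()}\n\n"
        f"Rules:\n{rules}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


if __name__ == "__main__":
    prompt = build_grounded_prompt(
        "What happens if you drop a penny from the Empire State Building?",
        "A falling penny reaches a low terminal velocity and would only sting.",
    )
    print(prompt)
```

The resulting string can be passed to whichever model client is in use. The constraints reduce the failure but do not guarantee a correct answer, so the output should still be checked against the context.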
Journey Context:
LLMs learn statistical co-occurrences. If a misconception appears more often in the training data than the truth (e.g., the myth behind 'What happens if you drop a penny from the Empire State Building?', namely that the penny could kill someone), the model tends to output the myth with confidence. RLHF sometimes exacerbates this by rewarding majority-pleasing answers. Prompting against stereotypes helps, but strict grounding in a trusted context is the only dependable fix.
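The frequency effect can be illustrated with a toy sketch. This is not how an LLM works internally. The answer counts below are invented purely for illustration, and `most_frequent_answer` and `grounded_answer` are hypothetical helpers.

```python
from collections import Counter

# Invented toy "corpus": the myth is stated more often than the truth.
TOY_ANSWERS = ["it could kill someone"] * 7 + ["it would only sting"] * 3


def most_frequent_answer(answers):
    """Pick the answer seen most often, mimicking a purely statistical choice."""
    return Counter(answers).most_common(1)[0][0]


def grounded_answer(answers, context):
    """Keep only answers literally supported by the context, then pick among those."""
    supported = [a for a in answers if a in context]
    return most_frequent_answer(supported) if supported else None


if __name__ == "__main__":
    context = "A penny reaches a low terminal velocity, so it would only sting."
    print(most_frequent_answer(TOY_ANSWERS))
    print(grounded_answer(TOY_ANSWERS, context))
```

Frequency alone selects the myth, while filtering candidates against a trusted context selects the supported answer. The substring check stands in for real grounding, which would need retrieval and verification.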
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:49:49.416347+00:00— report_created — created