Report #70162
[research] LLM outputs widely believed but factually incorrect information
Add a myth-busting step or few-shot examples to the system prompt that explicitly discourage repeating common misconceptions, then evaluate the result against TruthfulQA.
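A minimal sketch of what this could look like, under assumptions: the names `MythExample`, `build_system_prompt`, and `mc_accuracy` are hypothetical, the two few-shot items are illustrative and not taken from TruthfulQA, and `mc_accuracy` is a simple multiple-choice accuracy stand-in, not TruthfulQA's official metric. The `choose` callable is where a real model call would go.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class MythExample:
    question: str
    misconception: str
    correction: str


# Illustrative few-shot items (assumption: not drawn from any benchmark).
FEW_SHOT = [
    MythExample(
        question="What percentage of the brain do humans use?",
        misconception="Humans use only 10% of their brains.",
        correction="Humans use virtually all of the brain; the 10% figure is a myth.",
    ),
    MythExample(
        question="Does cracking your knuckles cause arthritis?",
        misconception="Yes, it wears out the joints.",
        correction="Studies have not found a link between knuckle cracking and arthritis.",
    ),
]


def build_system_prompt(examples: Sequence[MythExample]) -> str:
    """Assemble a system prompt with a myth-check step and few-shot corrections."""
    lines = [
        "Before answering, check whether the most familiar answer is a common misconception.",
        "If it is, give the accurate answer and briefly note the misconception.",
        "If you are unsure, say so rather than repeating a popular claim.",
        "",
        "Examples:",
    ]
    for ex in examples:
        lines += [
            f"Q: {ex.question}",
            f"Common but wrong: {ex.misconception}",
            f"A: {ex.correction}",
            "",
        ]
    return "\n".join(lines).rstrip()


# Each item: (question, answer choices, index of the truthful choice).
Item = Tuple[str, List[str], int]


def mc_accuracy(items: Sequence[Item], choose: Callable[[str, List[str]], int]) -> float:
    """Fraction of items where `choose` picks the truthful choice."""
    if not items:
        raise ValueError("items must not be empty")
    correct = sum(1 for q, choices, idx in items if choose(q, choices) == idx)
    return correct / len(items)
```

To compare, run `mc_accuracy` once with a `choose` that calls the model without the system prompt and once with `build_system_prompt(FEW_SHOT)` prepended, using the same items for both runs.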
Journey Context:
LLMs are trained to maximize the likelihood of their training data. If a misconception appears 100x more often than the truth, the model is likely to reproduce the myth with high confidence. Standard RLHF can exacerbate this unless the model is specifically trained to prioritize truthfulness over perceived helpfulness.
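The frequency argument can be illustrated with a toy calculation. This is a sketch under a strong simplifying assumption: the "model" is just a maximum-likelihood estimate over whole completions, which ignores everything a real LLM does with context. The function name `mle_distribution` and the labels `"myth"` and `"truth"` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def mle_distribution(completions: Iterable[str]) -> Dict[str, float]:
    """Maximum-likelihood estimate: probability equals relative frequency."""
    counts = Counter(completions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("completions must not be empty")
    return {text: n / total for text, n in counts.items()}


# Toy corpus: the misconception appears 100 times, the truth once.
corpus = ["myth"] * 100 + ["truth"]
probs = mle_distribution(corpus)

# Greedy decoding picks the most probable completion.
greedy_choice = max(probs, key=probs.get)

print(f"P(myth)  = {probs['myth']:.3f}")   # 100/101, about 0.990
print(f"P(truth) = {probs['truth']:.3f}")  # 1/101, about 0.010
print(f"greedy   = {greedy_choice}")       # myth
```

Under this toy model the truthful completion is sampled about 1% of the time and is never chosen by greedy decoding, which is why a prompt-level or training-level correction is needed to shift the output.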
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:21:05.478044+00:00— report_created — created