Report #4465
[research] LLM repeats common human misconceptions as if they were facts
In high-misconception domains (security, health, law, performance tuning), explicitly prompt for the scientifically or authoritatively correct answer and verify against primary sources. Do not assume popularity in training data equals truth.
Journey Context:
TruthfulQA is constructed so that questions exploit human false beliefs and misconceptions; models often fail by reproducing the common but wrong answer because they are trained to predict typical text. This is 'imitative falsehood' rather than random confabulation, and it is especially dangerous in domains where 'everyone knows' something incorrect. For coding agents, examples include performance myths, security folklore, and framework tribal knowledge. The fix is source-based verification, not majority-vote reasoning.
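A minimal sketch of the difference between majority-vote reasoning and source-based verification, assuming several sampled answers to the same question. The names here (`Answer`, `verified_source`, `build_prompt`, `majority_vote`, `source_verified`) are hypothetical and only illustrate the idea; `verified_source` is assumed to be filled in only after a person or tool has actually checked the cited primary source, not copied from the model's own output.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Domains named in this report as misconception-prone.
HIGH_MISCONCEPTION_DOMAINS = {"security", "health", "law", "performance tuning"}


@dataclass
class Answer:
    text: str
    # Hypothetical field: set only after the claim was checked against a
    # primary source (spec, official docs, statute, benchmark you ran).
    verified_source: Optional[str] = None


def build_prompt(question: str, domain: str) -> str:
    """Ask explicitly for the authoritative answer in risky domains."""
    if domain in HIGH_MISCONCEPTION_DOMAINS:
        return (
            "Give the answer supported by authoritative primary sources, "
            "even if a different answer is more commonly repeated. "
            "Name the source you are relying on.\n\n" + question
        )
    return question


def majority_vote(answers: List[Answer]) -> str:
    """Pick the most frequent answer; this rewards popularity, not truth."""
    return Counter(a.text for a in answers).most_common(1)[0][0]


def source_verified(answers: List[Answer]) -> Optional[str]:
    """Accept only an answer that has a checked primary source attached."""
    for a in answers:
        if a.verified_source:
            return a.text
    return None  # nothing verified yet: treat the question as still open


samples = [
    Answer("common myth"),
    Answer("common myth"),
    Answer("documented behavior", verified_source="official spec (checked by reviewer)"),
]
print(majority_vote(samples))    # the popular answer wins
print(source_verified(samples))  # the verified answer wins
```

When no sampled answer carries a verified source, `source_verified` returns `None` instead of falling back to the popular answer, which keeps an imitative falsehood from being accepted by default.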
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:32:35.687231+00:00— report_created — created