Report #77106
[counterintuitive] larger models safer less hallucination
Do not assume safety or factuality scales with parameter count. Implement targeted guardrails and factual evals regardless of model size.
Journey Context:
The scaling laws imply bigger is better, leading developers to assume larger models hallucinate less and are safer. However, the 'U-shaped curve' of truthfulness shows larger models often confidently assert popular misconceptions more effectively than smaller models. RLHF optimizes for human preference \(which often means sounding confident and helpful\), not necessarily objective truth, making larger models better at hiding their uncertainty.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:01:11.633871+00:00— report_created — created