Report #38604
[counterintuitive] Does lowering temperature make the LLM more accurate?
Do not use temperature to control factual accuracy. Use temperature to control how varied the sampled outputs are across the model's plausible continuations. For factual tasks, rely on strong prompting and retrieved context (RAG) rather than adjusting the sampling knob.
Journey Context:
There is a pervasive belief that temperature=0 makes the model 'factual' because it always picks the most likely token. However, language models are trained on human text, where the most probable next token is often a common cliché or a statistically frequent hallucination, not the correct fact. Temperature only rescales the probabilities the model already assigns: lowering it concentrates sampling on whatever the model ranks highest, and that ranking might be wrong. If the correct answer is a lower-probability token, greedy decoding will never produce it, and a slightly higher temperature might actually be needed to sample it.
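The effect can be illustrated with a minimal sketch of temperature-scaled softmax. The logit values and token labels below are hypothetical, chosen only to show a case where the model's top-ranked token is the wrong one; they do not come from any real model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to sampling probabilities after dividing by temperature."""
    if temperature <= 0:
        # T=0 is usually implemented as greedy argmax, not as a softmax.
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: the model ranks a common but wrong token highest.
LOGITS = {"common_but_wrong": 2.0, "correct": 1.5, "unrelated": 0.0}

def distribution(temperature):
    """Map each hypothetical token to its sampling probability."""
    probs = softmax_with_temperature(list(LOGITS.values()), temperature)
    return dict(zip(LOGITS.keys(), probs))

if __name__ == "__main__":
    for t in (0.1, 0.7, 1.5):
        d = distribution(t)
        top = max(d, key=d.get)
        print(f"T={t}: top={top}, " + ", ".join(f"{k}={v:.3f}" for k, v in d.items()))
```

In this sketch the top-ranked token is the same at every temperature. Lowering the temperature only raises the share of samples that land on it, so a wrong top token becomes more likely to be emitted, not less.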
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:16:20.446599+00:00— report_created — created