Report #59216
[synthesis] Chain-of-thought early errors amplify deterministically when temperature is zero
Use temperature 0.3-0.7 for reasoning chains and employ self-consistency sampling; reserve temperature=0 for the final structured output extraction.
Journey Context:
Developers often set temperature=0 on the assumption that it reduces hallucinations, but in chain-of-thought reasoning it can turn an early slip into a deterministic error cascade. Suppose the model makes a subtle error at step 2 (misreading '23' as '32'). With temperature=0, decoding is greedy, so step 3 follows the single most likely continuation of the conditioned path: it treats the error as ground truth and builds a logically consistent but factually wrong chain, and rerunning the same prompt will tend to reproduce the same mistake. A higher temperature introduces stochastic 'branching' at each step, so some sampled chains explore alternative interpretations. Self-consistency (sampling 5-10 chains and voting on the final answer) catches these errors as long as the mistaken path stays in the minority. The key insight: temperature=0 optimizes for consistency with the immediate prior context, which is dangerous when that context contains errors.
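The voting step described above can be sketched as follows. This is a minimal illustration, not a tested workaround: `sample_chain` is a hypothetical callable standing in for whatever model call you use (it takes a prompt and a temperature and returns the chain's final answer), and `scripted_sampler` is a toy replacement that replays fixed answers so the example runs without a model. The temperature constants are example values picked from the range recommended in this report.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, List, Sequence, Tuple

# Example values only: mid-range sampling for reasoning, greedy for extraction.
REASONING_TEMPERATURE = 0.5
EXTRACTION_TEMPERATURE = 0.0


def self_consistency(
    sample_chain: Callable[[str, float], str],
    prompt: str,
    n_samples: int = 7,
    temperature: float = REASONING_TEMPERATURE,
) -> Tuple[str, float, List[str]]:
    """Sample several chains and return (majority answer, agreement, all answers).

    `sample_chain` is an assumed interface, not a real library function:
    it must run one reasoning chain and return only its final answer.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [sample_chain(prompt, temperature) for _ in range(n_samples)]
    # Majority vote; on a tie, Counter keeps the answer that was seen first.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples, answers


def scripted_sampler(answers: Sequence[str]) -> Callable[[str, float], str]:
    """Toy stand-in for a model: replays the given final answers in order."""
    replay = cycle(answers)

    def sample_chain(prompt: str, temperature: float) -> str:
        return next(replay)

    return sample_chain


if __name__ == "__main__":
    # Task: 23 + 10. One of five chains misreads '23' as '32' and answers 42.
    sampler = scripted_sampler(["33", "42", "33", "33", "33"])
    winner, agreement, _ = self_consistency(sampler, "What is 23 + 10?", n_samples=5)
    print(winner, agreement)  # prints: 33 0.8
```

A low agreement score is a useful signal in its own right: if no answer clearly dominates, the vote cannot be trusted to have filtered out the mistaken path, and the result may deserve a retry or manual review before the temperature=0 extraction step.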
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:53:14.218715+00:00— report_created — created