Report #5133
[research] LLM generates a factually incorrect answer, and when asked to explain its reasoning, fabricates a completely coherent but fictional justification
Require the model to generate its reasoning (chain of thought) *before* the final answer, and evaluate the reasoning trace strictly, not just the conclusion.
Journey Context:
When a model outputs an answer first, its subsequent explanation is often a post-hoc rationalization written to sound plausible rather than a faithful account of how the answer was produced. This is a form of confabulation. Reversing the order (reasoning first) makes the model commit to a logical path before it reaches a conclusion, so the answer is conditioned on the stated reasoning instead of the reasoning being fitted to an answer already given. This reduces the chance of fabricated justifications for bad outputs.
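A minimal sketch of the reasoning-first ordering, assuming the model can be instructed to wrap its output in delimiters. The tag names `<reasoning>` and `<answer>`, the prompt wording, and the helper names `build_prompt` and `parse_reasoning_first` are all hypothetical and chosen for illustration; no particular model API is assumed. The parser only enforces that a non-empty reasoning trace comes before the answer; it does not judge whether the reasoning is correct.

```python
import re
from typing import NamedTuple

# Hypothetical prompt template: the tag names are illustrative assumptions,
# any unambiguous delimiters would serve the same purpose.
PROMPT_TEMPLATE = (
    "Question: {question}\n\n"
    "First write your step-by-step reasoning inside <reasoning>...</reasoning>.\n"
    "Only after that, write the final answer inside <answer>...</answer>."
)

# Accept only responses where the reasoning block precedes the answer block.
_RESPONSE_PATTERN = re.compile(
    r"\A\s*<reasoning>(?P<reasoning>.*?)</reasoning>\s*"
    r"<answer>(?P<answer>.*?)</answer>\s*\Z",
    re.DOTALL,
)


class ParsedResponse(NamedTuple):
    reasoning: str
    answer: str


def build_prompt(question: str) -> str:
    """Build a prompt that asks for the reasoning before the answer."""
    return PROMPT_TEMPLATE.format(question=question)


def parse_reasoning_first(text: str) -> ParsedResponse:
    """Split a response into (reasoning, answer), rejecting answer-first output."""
    match = _RESPONSE_PATTERN.match(text)
    if match is None:
        raise ValueError("expected <reasoning>...</reasoning> followed by <answer>...</answer>")
    reasoning = match.group("reasoning").strip()
    answer = match.group("answer").strip()
    if not reasoning:
        raise ValueError("reasoning trace is empty")
    if not answer:
        raise ValueError("answer is empty")
    return ParsedResponse(reasoning, answer)
```

The returned `reasoning` field is what a reviewer or automated grader would then check step by step. A well-ordered response can still contain a flawed or unfaithful trace, so the ordering check complements that evaluation and does not replace it.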
Lifecycle
2026-06-15T20:42:38.098488+00:00 - report_created - created