Report #76773
[research] LLM generates a correct answer but hallucinates the reasoning path, or doubles down on a wrong answer with fabricated justifications
Evaluate the reasoning chain independently of the final answer. Use process reward models (PRMs) or step-by-step verification tools rather than outcome reward models (ORMs). If a step cannot be verified via retrieval or logic, discard the entire generation.
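A minimal sketch of the "verify every step, discard on any failure" rule, assuming the reasoning chain has already been split into steps of the form `a op b = c`. The step format and the names `verify_step` and `chain_is_verified` are hypothetical and chosen for illustration. A simple arithmetic checker stands in here for a real PRM or retrieval-based verifier.

```python
import re
from fractions import Fraction
from typing import List

# Hypothetical step format for this sketch: "<int> <op> <int> = <int>"
_STEP = re.compile(r"^\s*(-?\d+)\s*([+\-*/])\s*(-?\d+)\s*=\s*(-?\d+)\s*$")


def verify_step(step: str) -> bool:
    """Return True only if the step parses and its arithmetic holds."""
    m = _STEP.match(step)
    if m is None:
        return False  # a step we cannot check counts as unverified
    a, op, b, c = int(m.group(1)), m.group(2), int(m.group(3)), int(m.group(4))
    if op == "+":
        return a + b == c
    if op == "-":
        return a - b == c
    if op == "*":
        return a * b == c
    # Division: exact rational comparison; division by zero never verifies
    return b != 0 and Fraction(a, b) == c


def chain_is_verified(steps: List[str]) -> bool:
    """Accept a generation only if it has steps and every one of them verifies."""
    return bool(steps) and all(verify_step(s) for s in steps)


good = ["3 * 4 = 12", "12 + 5 = 17"]
bad = ["3 + 4 = 8", "8 + 9 = 17"]      # same final answer, fabricated first step
vague = ["so the answer is clearly 17"]  # cannot be checked, so it is rejected

print(chain_is_verified(good))   # True
print(chain_is_verified(bad))    # False
print(chain_is_verified(vague))  # False
```

The final answer is never consulted: `bad` ends in the same value as `good` but is still rejected, which is the behavior an outcome-only filter would not give.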
Journey Context:
LLMs are prone to confabulation: they generate a plausible-sounding logical narrative that does not reflect how the model actually arrived at its output. This is especially dangerous in code or math, where a right answer derived from wrong logic tends to fail on edge cases. Outcome-based filtering checks only the final answer and so misses this; process-based verification is required.
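To illustrate the edge-case failure, here is a small sketch in which a hypothetical generated function rests on incomplete reasoning ("divisible by 4 means leap year"). The names `is_leap_generated` and `is_leap_reference` and the chosen test years are assumptions made for this example, not output from any real model.

```python
def is_leap_generated(year: int) -> bool:
    # Hypothetical model output, justified by the flawed rule
    # "a year is a leap year if it is divisible by 4".
    return year % 4 == 0


def is_leap_reference(year: int) -> bool:
    # Gregorian rule: divisible by 4, except centuries not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Outcome-only check on a single common input: the flawed logic passes.
outcome_ok = is_leap_generated(2024) == is_leap_reference(2024)

# Checking the inputs where the stated rule is incomplete exposes the flaw.
edge_cases = [1900, 2000, 2100]
failures = [y for y in edge_cases
            if is_leap_generated(y) != is_leap_reference(y)]

print(outcome_ok)  # True
print(failures)    # [1900, 2100]
```

A filter that only compared the 2024 result would accept this generation, while verifying the stated rule itself would flag it before the century years were ever tested.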
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:27:05.781716+00:00— report_created — created