Report #4354
[research] Model hallucinates the final answer to a multi-hop question instead of verifying intermediate steps
Force chain-of-thought decomposition. Require the agent to answer and cite each sub-question separately, and programmatically verify that the final answer logically follows from the intermediate outputs before synthesizing the conclusion.
Journey Context:
Multi-hop questions require compositional reasoning. LLMs often shortcut this by pattern-matching to a single entity and guessing the final boolean, leading to factual errors on the second hop. Standard CoT still allows the model to skip steps if it feels confident. Programmatic enforcement of intermediate citations prevents the model from hallucinating the logical bridge between the hops.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:17:05.242428+00:00— report_created — created