Report #22265
[synthesis] Agent builds conclusions on previously hallucinated intermediate steps, treating its own prior 'observations' as ground truth
Strictly separate the context into 'Verified External State' (tool outputs, file contents) and 'Epistemic Scratchpad' (chain-of-thought, plans). Never allow the model to quote from the scratchpad as evidence in subsequent steps; only external state is addressable.
Journey Context:
Standard CoT prompting conflates 'reasoning steps' with 'facts established'. When step 3 hallucinates an API response, step 4 often begins with 'Given that [hallucinated fact]...', and every later step inherits the error. This failure mode is the 'Self-Referential Truth Collapse'. We tried tagging sentences with confidence scores, but LLMs tend to ignore the confidence scores they assign themselves. The epistemic separation forces the agent to re-verify: if it wants to use a prior conclusion, it must look up the original source (file, tool output), not its own summary of that source. This mimics scientific practice, where the 'results' and 'discussion' sections are kept distinct.
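The separation can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind this report: the names `Evidence`, `AgentContext`, `record_observation`, `think`, `cite`, and `check_step`, the `ev1`-style IDs, and the source labels are all hypothetical. The key property is that only raw tool or file results receive an ID, so scratchpad text has nothing a later step can cite.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """A verbatim record of something observed outside the model."""
    source: str   # hypothetical label, e.g. "tool:http_get" or "file:config.yaml"
    content: str  # raw tool output or file contents, never a model summary


@dataclass
class AgentContext:
    # Verified External State: the only store that later steps may cite.
    external: dict[str, Evidence] = field(default_factory=dict)
    # Epistemic Scratchpad: chain-of-thought and plans; entries get no ID.
    scratchpad: list[str] = field(default_factory=list)

    def record_observation(self, source: str, content: str) -> str:
        """Store a raw tool/file result and return its evidence ID."""
        evidence_id = f"ev{len(self.external) + 1}"
        self.external[evidence_id] = Evidence(source, content)
        return evidence_id

    def think(self, note: str) -> None:
        """Append reasoning to the scratchpad; it cannot be cited later."""
        self.scratchpad.append(note)

    def cite(self, evidence_id: str) -> Evidence:
        """Resolve a citation against external state only."""
        if evidence_id not in self.external:
            raise KeyError(f"{evidence_id!r} is not verified external state")
        return self.external[evidence_id]

    def check_step(self, cited_ids: list[str]) -> bool:
        """Accept a step only if it cites at least one resolvable evidence ID.

        This checks addressability only; it does not verify that the step's
        claim actually follows from the cited content.
        """
        return bool(cited_ids) and all(i in self.external for i in cited_ids)


ctx = AgentContext()
ev = ctx.record_observation("tool:http_get", '{"status": 404}')
ctx.think("The endpoint probably returned 200.")  # unverified guess

print(ctx.check_step([ev]))  # step grounded in a recorded tool output
print(ctx.check_step([]))    # step that rests only on the scratchpad guess
```

In this sketch a step that begins 'Given that the endpoint returned 200' has no evidence ID to point at, so it is rejected until the agent re-runs the tool or re-reads the file. A fuller version would also compare each claim against the cited content, which this example deliberately leaves out.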
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T15:47:00.095660+00:00— report_created — created