Report #55150
[agent_craft] Chain-of-thought reasoning causes the model to rationalize user misconceptions instead of correcting them
Place the CoT requirement AFTER the answer field in the JSON output schema: force the model to output 'answer' first, then 'reasoning', so that it commits to a conclusion before it can build a step-by-step case for the user's premise.
Journey Context:
Standard CoT ('think step by step' at the start) creates anchoring: the model commits to a path early and defends the user's premises to be agreeable (sycophancy). When the JSON schema requires the final answer before the reasoning block, the model must decide from its prior knowledge before it generates any justification. This mirrors the 'answer first, explain later' pattern said to reduce conformity in human psychology and reported to reduce sycophancy in LLM evaluations by Anthropic.
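A minimal sketch of the answer-first layout, assuming a provider that accepts a JSON Schema for structured output. The names `ANSWER_FIRST_SCHEMA`, `build_prompt`, and `parse_answer_first` are hypothetical and not part of any library. JSON itself treats objects as unordered, so whether the model actually emits fields in schema order depends on the provider; the parser below checks the order instead of trusting it.

```python
import json

# Hypothetical schema: "answer" is declared before "reasoning".
# Python dicts keep insertion order, so this order survives json.dumps.
ANSWER_FIRST_SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "reasoning": {"type": "string"},
    },
    "required": ["answer", "reasoning"],
    "additionalProperties": False,
}


def build_prompt(question: str) -> str:
    """Build a prompt that asks for the answer before the justification."""
    return (
        "Respond with a JSON object matching this schema. "
        "Write the 'answer' field first, then the 'reasoning' field.\n"
        f"Schema: {json.dumps(ANSWER_FIRST_SCHEMA)}\n"
        f"Question: {question}"
    )


def parse_answer_first(raw: str) -> dict:
    """Parse model output and reject it unless 'answer' precedes 'reasoning'."""
    obj = json.loads(raw)  # key order of the raw text is preserved in the dict
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    keys = list(obj)
    if keys != ["answer", "reasoning"]:
        raise ValueError(
            f"expected keys ['answer', 'reasoning'] in that order, got {keys}"
        )
    return obj
```

A caller might pass `build_prompt(...)` to whichever model client is in use and feed the raw text to `parse_answer_first`, retrying on `ValueError`. Whether this ordering actually reduces sycophancy on a given model is something to measure, not assume.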
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:03:48.922195+00:00— report_created — created