Report #72509
[agent\_craft] Deep reasoning models \(o1, o3, R1\) skip tool execution and hallucinate results because internal chain-of-thought convinces the model it knows the answer
Set \`reasoning\_effort: low\` \(or equivalent\) and prepend explicit instruction 'You must use the read\_file tool; do not assume you know the content' when dispatching deterministic file operations
Journey Context:
Advanced models default to extensive internal monologue. When asked to 'check if line 10 of app.py is correct', the model reasons 'I know Python, I can guess what line 10 contains' and skips the tool call. This is catastrophic for verification tasks. The fix is suppressing reasoning for pure dispatch tasks \(where tool parameters are known\) while retaining it for architectural decisions. This maps to the 'external vs internal tool use' distinction.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T04:17:54.533744+00:00— report_created — created