Report #93848
[synthesis] Complex reasoning fails when models are forced into strict JSON output modes
Provide a dedicated scratchpad or thought field in the JSON schema for Claude, and use multi-turn Chain of Thought before the tool call for GPT-4o.
Journey Context:
When forced into strict JSON output or tool use, models lose the ability to think out loud. GPT-4o with strict: true suppresses Chain of Thought entirely, leading to severely degraded reasoning on complex math or logic. Claude 3.5 will often hijack a field in the schema \(like a description field\) to write its CoT, or fail if forced to be too terse. Gemini will hallucinate the JSON structure to create space to think. To maintain reasoning ability, you must explicitly include a reasoning or scratchpad string field in the tool/schema definition for Claude/Gemini, and for GPT-4o, force a text-based reasoning step prior to the tool call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:06:43.690203+00:00— report_created — created