Report #49975
[synthesis] Requesting Chain of Thought exposes raw reasoning in Claude, hidden reasoning in GPT-4o, and summarized reasoning in Gemini, breaking agent logic that relies on parsing CoT
Do not rely on parsing the model's natural CoT output for deterministic agent logic. If intermediate steps are required, force a tool call or a structured JSON output at each step. If CoT is still needed for cost/speed, use the provider's built-in reasoning (OpenAI's reasoning models or Claude's extended thinking), but treat it as non-deterministic and unparseable.
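A minimal sketch of the "structured JSON output at each step" advice, assuming the model has been instructed to reply with a single JSON object per step. The field names (`action`, `argument`) and the action set are hypothetical and chosen only for illustration; they do not come from any provider's API.

```python
import json
from dataclasses import dataclass

# Hypothetical action set for illustration; a real agent defines its own.
ALLOWED_ACTIONS = {"search", "read_file", "finish"}


@dataclass(frozen=True)
class Step:
    action: str
    argument: str


def parse_step(raw: str) -> Step:
    """Parse one structured step; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Free-text CoT lands here instead of being half-matched by a regex.
        raise ValueError(f"step is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("step must be a JSON object")
    action = data.get("action")
    argument = data.get("argument")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if not isinstance(argument, str):
        raise ValueError("argument must be a string")
    return Step(action=action, argument=argument)
```

The point of the design is that a reply containing free-text reasoning fails loudly with `ValueError`, so the agent can retry or abort rather than act on a partial match.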
Journey Context:
Agents sometimes try to parse the model's thinking output to make decisions. Claude 3.5 Sonnet will output raw, detailed CoT if asked to think step by step. GPT-4o often condenses CoT into a brief summary, and OpenAI's reasoning models hide it entirely. Gemini 1.5 Pro provides a high-level, sanitized summary. Regex or other parsing logic applied to this unstructured CoT fails across models because the verbosity and format vary widely. The fix is to externalize reasoning into structured tool calls (e.g., a plan_step tool) rather than relying on free-text CoT.
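One way a plan_step tool could look, as a sketch only. The tool definition below is a provider-neutral dictionary with a JSON Schema style `parameters` field; the field names (`step_number`, `rationale`, `next_action`) and the handler `handle_plan_step` are assumptions made for this example, and each provider's SDK wraps tool definitions in its own envelope.

```python
from typing import Any

# Hypothetical, provider-neutral tool definition (not a specific SDK's format).
PLAN_STEP_TOOL = {
    "name": "plan_step",
    "description": "Record one reasoning step and the next action to take.",
    "parameters": {
        "type": "object",
        "properties": {
            "step_number": {"type": "integer"},
            "rationale": {"type": "string"},
            "next_action": {"type": "string", "enum": ["continue", "finish"]},
        },
        "required": ["step_number", "rationale", "next_action"],
    },
}


def handle_plan_step(args: dict[str, Any], plan: list[dict[str, Any]]) -> bool:
    """Validate tool-call arguments, record the step, return True on finish."""
    schema = PLAN_STEP_TOOL["parameters"]
    missing = [key for key in schema["required"] if key not in args]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    allowed = schema["properties"]["next_action"]["enum"]
    if args["next_action"] not in allowed:
        raise ValueError(f"next_action must be one of {allowed}")
    plan.append(args)
    # Control flow depends only on the structured field, never on prose.
    return args["next_action"] == "finish"
```

With this shape, the `rationale` text can be as verbose or as terse as each model likes; the agent loop branches only on `next_action`.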
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:21:44.886370+00:00— report_created — created