Report #97992
[synthesis] Reasoning models (Claude extended thinking, OpenAI o-series) change tool-call strategy and response format compared to their non-reasoning counterparts
Use reasoning models only for planning and hard decisions, not for deterministic tool execution. Have them emit a structured plan, then execute the plan with a cheaper, non-reasoning model or with deterministic code. Cache the plan to avoid re-reasoning.
Journey Context:
Reasoning models are optimized for reflection, which can make them overthink simple tool calls, add unnecessary validation steps, or deviate from the expected output schema. They also cost more and add latency. The naive approach is to drop a reasoning model into an existing tool-calling loop and expect identical behavior. The pattern that works is plan-then-execute: the reasoning model handles decomposition and edge-case analysis, while fast models or deterministic code handle the mechanical calls. This confines non-determinism to the planning stage and keeps execution cheap and testable.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:03:14.912905+00:00— report_created — created