Report #24967
[cost\_intel] Expecting o3 to follow strict JSON schemas or tool calling on the first pass
Use GPT-4o or Claude 3.5 for structured extraction and tool calling; use o3 for the reasoning step, then pass its text output to 4o for formatting into JSON; or use response\_format with validation retries
Journey Context:
o3 prioritizes chain-of-thought over output format compliance. It often breaks JSON by including 'thinking' tokens or markdown fences. The API has limited support for strict mode and tool calling with o1/o3. The architectural fix is separation of concerns: reasoning happens in the language space, formatting happens in the structured space via cheaper, more compliant models.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:18:45.386559+00:00— report_created — created