Report #49803
[cost\_intel] Using reasoning models for strict schema compliance
Prefer GPT-4o \(instruct\) over o1 for JSON extraction, function calling, and Pydantic schema adherence. o1 'overthinks' simple extraction, introducing creative interpretation where strict compliance is needed. Watch for 'helpful' field hallucinations in o1, i.e. fields or values that the schema does not define.
Journey Context:
o1 optimizes for 'helpful assistant' behavior, which conflicts with rigid schema extraction. Instruct models fine-tuned for tool use \(4o\) have better constraint satisfaction. The degradation signature: o1 adds explanatory text inside JSON strings or invents plausible but unvalidated enum values to be 'helpful'. The cliff sits at the boundary between pure extraction \(no reasoning\) and synthesis \(requires inference\); the recommendation above applies to the pure-extraction side.
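
The degradation signature described above can be caught with a strict post-check on the raw model output. The following is a minimal sketch using only the Python standard library, not a definitive implementation. The schema (`status`, `priority`, `summary`), the enum values, the length limit, and the function name `check_extraction` are all hypothetical and chosen for illustration; in practice a Pydantic model with the real schema would likely play this role.

```python
import json

# Hypothetical schema for illustration only: the field names, enum values,
# and length limit below are assumptions, not taken from the report.
ALLOWED_FIELDS = {"status", "priority", "summary"}
ALLOWED_ENUMS = {
    "status": {"open", "closed", "pending"},
    "priority": {"low", "medium", "high"},
}
MAX_SUMMARY_CHARS = 120  # assumed limit; overly long strings may hide commentary


def check_extraction(raw: str) -> list[str]:
    """Return a list of schema violations found in a model's raw JSON output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Covers output wrapped in prose such as "Sure! Here is the JSON ..."
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    # Fields the schema does not define ('helpful' field hallucinations).
    for name in sorted(set(data) - ALLOWED_FIELDS):
        problems.append(f"unexpected field: {name}")
    # Fields the schema requires but the output omits.
    for name in sorted(ALLOWED_FIELDS - set(data)):
        problems.append(f"missing field: {name}")
    # Plausible-looking but unvalidated enum values.
    for name, allowed in ALLOWED_ENUMS.items():
        if name in data:
            value = data[name]
            if not isinstance(value, str) or value not in allowed:
                problems.append(f"invalid enum value for {name}: {value!r}")
    # Explanatory text stuffed into a string field tends to be long.
    summary = data.get("summary")
    if isinstance(summary, str) and len(summary) > MAX_SUMMARY_CHARS:
        problems.append("summary exceeds length limit")
    return problems
```

An empty list means the output passed every check. Logging the violations per model over the same inputs is one way to verify the claim for a given workload before switching models.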
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:04:33.229287+00:00— report_created — created