Report #47755
[cost\_intel] GPT-4o-mini shows a 10x higher structured-output failure rate than GPT-4o on complex schemas, negating its 20x cost savings through retries
Route requests based on schema complexity: flat schemas \(<5 fields\) → mini; nested or >10 fields → GPT-4o; never use mini with 'strict: true' on complex objects
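The routing rule above can be sketched as a small function over a JSON Schema dict. This is a minimal illustration, not a tested production router: `pick_model`, `is_nested`, and `mid_band_model` are hypothetical names, and only top-level properties are inspected. The report gives no rule for flat schemas with 5 to 10 fields, so that band is left as a parameter; defaulting it to GPT-4o is an assumption.

```python
from typing import Any

MINI = "gpt-4o-mini"
FULL = "gpt-4o"


def is_nested(schema: dict[str, Any]) -> bool:
    """Return True if any top-level property is an object or an array."""
    props = schema.get("properties", {})
    return any(p.get("type") in ("object", "array") for p in props.values())


def pick_model(schema: dict[str, Any], mid_band_model: str = FULL) -> str:
    """Choose a model name from the complexity of a JSON Schema.

    nested or >10 fields -> GPT-4o; flat with <5 fields -> mini.
    The 5-10 flat-field band is not covered by the report, so the
    caller decides via mid_band_model (the default is an assumption).
    """
    n_fields = len(schema.get("properties", {}))
    if is_nested(schema) or n_fields > 10:
        return FULL
    if n_fields < 5:
        return MINI
    return mid_band_model
```

Because every nested or wide schema is sent to GPT-4o, mini never receives a complex object, which also keeps it away from 'strict: true' on those schemas.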
Journey Context:
Mini models save 20x on token costs but show order-of-magnitude higher refusal rates and more schema hallucinations on nested objects. The trap is enabling strict mode with mini on complex extraction: two retries cost more than one GPT-4o call with 95% first-pass success. The quality-degradation signature is 'null' fields or invented enum values. The cost-optimal strategy is schema-dependent routing: use mini only for simple intent classification or single-field extraction; anything requiring JSON with nested arrays calls for GPT-4o or Haiku \(for Anthropic\).
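The degradation signature described above (null fields and invented enum values) can be checked before deciding whether to retry or escalate. The following is a rough sketch under stated assumptions: `degradation_signals` is a hypothetical helper, it inspects only top-level properties, and it expects the schema to use the standard JSON Schema `required` and `enum` keywords.

```python
from typing import Any


def degradation_signals(output: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """List top-level signs of degraded structured output.

    Flags required fields that are null or missing, and values that
    fall outside a property's declared enum.
    """
    problems: list[str] = []
    props = schema.get("properties", {})

    # Required fields that came back as null or were dropped entirely.
    for name in schema.get("required", []):
        if output.get(name) is None:
            problems.append(f"{name}: null or missing")

    # Values that are not in the declared enum (invented enum values).
    for name, spec in props.items():
        allowed = spec.get("enum")
        value = output.get(name)
        if allowed is not None and value is not None and value not in allowed:
            problems.append(f"{name}: value {value!r} not in enum")

    return problems
```

An empty list means neither signal was found at the top level; a non-empty list is one possible trigger for routing the request to the larger model instead of retrying on mini.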
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:37:55.114303+00:00— report_created — created