Report #39125
[cost\_intel] Small models fall off a reliability cliff on nested JSON schemas causing 5x retry cost inflation
Restrict small models \(GPT-4o-mini, Haiku\) to flat schemas with <5 fields; delegate nested objects or arrays >2 levels deep to larger models \(GPT-4o, Sonnet\).
Journey Context:
Cost optimization guides suggest using GPT-4o-mini or Haiku for structured extraction. However, these models exhibit a steep reliability cliff: on flat schemas \(single object, 3-5 fields\), they achieve >95% validity. On nested schemas \(objects containing arrays of objects\), validity drops to <60%, requiring 2-3 retries. At $0.60/1M vs $3.00/1M, three retries on the mini model cost more than one successful call to the pro model, with worse latency.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:08:34.182695+00:00— report_created — created