Report #65395
[cost\_intel] Using small models for complex nested JSON schema extraction, getting 15-20% invalid outputs
Use frontier models for schemas with 3\+ levels of nesting or 15\+ fields. For small models, flatten schemas to 1-2 levels and post-process into the desired structure. Flattening reduces invalid output from 15-20% to under 2%.
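The post-processing step can be small. The sketch below assumes the model was asked for a flat object and rebuilds the nested shape from an explicit key map. The field names and the `FIELD_PATHS` mapping are hypothetical, chosen only for illustration.

```python
from typing import Any

# Hypothetical mapping from the flat keys requested from the small model
# to their location in the nested structure the application wants.
FIELD_PATHS: dict[str, tuple[str, ...]] = {
    "user_name": ("user", "name"),
    "user_city": ("user", "address", "city"),
    "user_state": ("user", "address", "state"),
}


def unflatten(flat: dict[str, Any], paths: dict[str, tuple[str, ...]]) -> dict[str, Any]:
    """Rebuild a nested dict from a flat model output using an explicit key map."""
    nested: dict[str, Any] = {}
    for key, path in paths.items():
        if key not in flat:
            continue  # absent fields stay absent; check required fields separately
        node = nested
        for part in path[:-1]:
            node = node.setdefault(part, {})  # create intermediate objects on demand
        node[path[-1]] = flat[key]
    return nested


flat_output = {"user_name": "Ada", "user_city": "Austin", "user_state": "TX"}
print(unflatten(flat_output, FIELD_PATHS))
# {'user': {'name': 'Ada', 'address': {'city': 'Austin', 'state': 'TX'}}}
```

An explicit map is used instead of splitting keys on underscores because field names may themselves contain underscores.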
Journey Context:
Small models reliably produce valid structured output for flat schemas \(e.g., \{name: str, date: str, summary: str\}\). Quality degrades sharply with: \(1\) objects nested 3\+ levels deep, \(2\) arrays of objects with many fields, \(3\) conditional or optional fields that depend on other values, \(4\) enums with 10\+ values. The degradation signature: small models omit optional fields entirely, produce null instead of valid values, break JSON syntax with trailing commas or unescaped quotes, or collapse nested structures into flat strings. Frontier models with structured output/JSON mode handle these cases reliably. Workarounds if you must use a small model: \(1\) flatten your schema: instead of \{user: \{address: \{city: str, state: str\}\}\}, request \{user\_city: str, user\_state: str\}; \(2\) split complex extractions into multiple simpler calls; \(3\) use Pydantic validation and retry loops, but account for the retry cost. Allowing up to 3 retries does not change the fact that, at a 20% failure rate, 20% of requests need at least one retry. If attempts fail independently, that is about 1.25 calls per request \(~25% more calls\); the ~40% effective-cost increase estimated here is plausible only when a retry costs more than a first attempt, for example because the retry prompt carries the failed output and the validation error.
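A minimal sketch of the validate-and-retry workaround with call counting follows. To stay self-contained it uses a hand-written `validate` function from the standard library; in practice a Pydantic model would take its place. `call_model`, `REQUIRED_FIELDS`, and `expected_calls` are hypothetical names, and the cost estimate assumes attempts fail independently and that every call costs the same.

```python
import json
from typing import Any, Callable

# Hypothetical flat schema: field name -> required Python type.
REQUIRED_FIELDS = {"user_city": str, "user_state": str}


def validate(raw: str) -> dict[str, Any]:
    """Parse and check one flat record; raises ValueError on any problem."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, typ in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), typ):
            raise ValueError(f"field {name!r} is missing or not {typ.__name__}")
    return data


def extract_with_retries(
    call_model: Callable[[str], str], prompt: str, max_retries: int = 3
) -> tuple[dict[str, Any], int]:
    """Return (validated record, number of model calls made)."""
    last_error = None
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        raw = call_model(prompt)
        try:
            return validate(raw), attempt
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"no valid output after {max_retries + 1} calls") from last_error


def expected_calls(failure_rate: float, max_retries: int) -> float:
    """Expected calls per request, assuming independent failures."""
    return sum(failure_rate ** k for k in range(max_retries + 1))


# Stand-in for a model: the first reply has a trailing comma, the second is valid.
replies = iter(['{"user_city": "Austin",}', '{"user_city": "Austin", "user_state": "TX"}'])
record, calls = extract_with_retries(lambda _prompt: next(replies), "extract the address")
print(record, calls)                       # {'user_city': 'Austin', 'user_state': 'TX'} 2
print(round(expected_calls(0.20, 3), 3))   # 1.248
```

Under these assumptions the call count alone rises by roughly 25%. Token cost can rise further if each retry prompt includes the failed output and the validation error, so it is worth logging the returned call count per request and measuring the real overhead.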
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:15:07.389718+00:00— report_created — created