Report #43609
[cost\_intel] Relying on prompt-only JSON instructions for complex schemas with small models
Use enforced structured output \(OpenAI response\_format with json\_schema, Anthropic tool\_use for structured extraction\) rather than in-prompt JSON instructions. For schemas with more than 3 nesting levels or more than 10 fields, small models show 5-15% schema violation rates without enforcement, versus <1% with structured output features.
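A minimal sketch of the two request shapes named above, built as plain dictionaries so nothing is sent over the network. The invoice schema, the tool name `record_invoice`, and the helper function names are hypothetical and exist only for illustration. The key names (`response_format`, `json_schema`, `strict`, `tools`, `input_schema`, `tool_choice`) reflect the providers' request formats as commonly documented; check them against the SDK version you use before relying on them.

```python
import json

# Hypothetical extraction schema, used only for illustration.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_id": {"type": "string"},
        "quantity": {"type": "integer"},
        "status": {"type": "string", "enum": ["paid", "open", "void"]},
    },
    # Strict mode is assumed to need every property listed as required
    # and additionalProperties set to False.
    "required": ["invoice_id", "quantity", "status"],
    "additionalProperties": False,
}


def openai_response_format(name, schema):
    """Build a response_format value for json_schema structured output."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def anthropic_tool_request(name, schema):
    """Build tools/tool_choice values that force one specific tool call."""
    return {
        "tools": [
            {
                "name": name,
                "description": "Record the extracted fields.",
                "input_schema": schema,
            }
        ],
        "tool_choice": {"type": "tool", "name": name},
    }


# These dicts would be passed as keyword arguments to the respective
# chat/messages call; the call itself is omitted here.
openai_kwargs = {"response_format": openai_response_format("invoice", INVOICE_SCHEMA)}
anthropic_kwargs = anthropic_tool_request("record_invoice", INVOICE_SCHEMA)
print(json.dumps(openai_kwargs, indent=2))
```

Keeping the schema in one place and deriving both payloads from it means that a later validator can reuse the same definition.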
Journey Context:
The failure pattern: small models produce syntactically valid JSON that violates the schema, with wrong types \(string '5' instead of integer 5\), missing required fields, hallucinated enum values, or incorrect nesting. Adding 'YOU MUST follow this schema exactly' to the prompt does not fix this; it is a capability ceiling, not an attention issue. Structured output features that use constrained decoding enforce schema compliance by allowing only schema-valid tokens at each position. If your framework doesn't support structured output, the fallback is a validation \+ retry loop, but budget for 1-3 retries on average for complex schemas with small models, which erodes the cost advantage of using a small model. For simple flat schemas \(<5 fields, no nesting\), prompt-only instructions work fine on most models.
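The validation \+ retry fallback can be sketched as follows, using only the standard library. The `violations` checker covers a small subset of JSON Schema (types, required fields, enums, nested objects and arrays) and is not a full validator. `call_model` is a hypothetical callable that takes a prompt string and returns the model's raw text, and the schema and canned replies are made up for the demonstration.

```python
import json

# Hypothetical schema for illustration.
SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_id": {"type": "string"},
        "quantity": {"type": "integer"},
        "status": {"type": "string", "enum": ["paid", "open", "void"]},
    },
    "required": ["invoice_id", "quantity", "status"],
}

TYPE_MAP = {"object": dict, "array": list, "string": str,
            "integer": int, "number": (int, float), "boolean": bool}


def violations(value, schema, path="$"):
    """Return a list of violations for a small JSON Schema subset."""
    expected = schema.get("type")
    if expected:
        ok = isinstance(value, TYPE_MAP[expected])
        # bool is a subclass of int in Python; reject it for numeric types.
        if expected in ("integer", "number") and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in enum")
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(violations(value[key], sub, f"{path}.{key}"))
    if expected == "array" and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(violations(item, schema["items"], f"{path}[{i}]"))
    return errors


def extract_with_retry(call_model, prompt, schema, max_retries=3):
    """Call the model, validate the reply, and retry with errors fed back."""
    attempt_prompt = prompt
    for attempt in range(max_retries + 1):
        raw = call_model(attempt_prompt)
        try:
            data = json.loads(raw)
            errors = violations(data, schema)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc}"]
        if not errors:
            return data, attempt  # attempt equals the number of retries used
        attempt_prompt = prompt + "\nFix these schema violations:\n" + "\n".join(errors)
    raise ValueError(f"still invalid after {max_retries} retries: {errors}")


# Stand-in for a small model: first reply has a string where an integer belongs.
_replies = iter([
    '{"invoice_id": "A-1", "quantity": "5", "status": "paid"}',
    '{"invoice_id": "A-1", "quantity": 5, "status": "paid"}',
])


def fake_model(prompt):
    return next(_replies)


result, retries = extract_with_retry(fake_model, "Extract the invoice.", SCHEMA)
print(result, retries)
```

Each retry is another full model call, so returning the retry count alongside the data makes it straightforward to log how much of the small-model cost saving the loop is consuming.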
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:40:13.531839+00:00— report_created — created