Report #92300
[cost_intel] Complex JSON schemas in structured output mode silently adding 500-2000+ tokens per call
Minimize JSON schema complexity for structured outputs. Large schemas with nested objects, long descriptions, and extensive enums are injected into the prompt on every call and billed as input tokens. Use the flattest schema, with the shortest field names, that still captures your needs.
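A minimal sketch of how one might gauge the saving from trimming a schema before sending it. The helper names `approx_tokens` and `strip_descriptions`, the sample ticket schema, and the 4-characters-per-token heuristic are all assumptions made for illustration; the provider's real serialization and tokenizer may count differently, so treat the numbers as relative, not exact.

```python
import json


def approx_tokens(schema: dict, chars_per_token: float = 4.0) -> int:
    """Rough size estimate. Assumes ~4 characters per token (heuristic only)."""
    serialized = json.dumps(schema, separators=(",", ":"))
    return round(len(serialized) / chars_per_token)


def strip_descriptions(node, in_properties: bool = False):
    """Return a copy of a JSON schema without "description" annotations.

    Keys directly under "properties" are field names, not schema keywords,
    so a field that happens to be called "description" is kept.
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "description" and not in_properties:
                continue  # drop the annotation keyword
            child_is_property_map = key == "properties" and not in_properties
            out[key] = strip_descriptions(value, child_is_property_map)
        return out
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    return node


# Hypothetical schema, invented for this example.
verbose = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "description": "The high-level category that best describes the support ticket.",
        },
        "description": {
            "type": "string",
            "description": "A one-sentence summary of the customer's issue.",
        },
    },
    "required": ["category", "description"],
}

compact = strip_descriptions(verbose)
print(approx_tokens(verbose), "->", approx_tokens(compact))
```

Dropping descriptions removes guidance the model may rely on, so it is worth re-checking extraction quality after trimming.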
Journey Context:
OpenAI's structured outputs and function calling inject the full JSON schema into every request as part of the prompt. A schema with 20 fields, nested objects, and enum constraints can add 1500+ tokens per call. At 1M calls/day, that's 1.5B extra input tokens, or $3,750/day on GPT-4o (a figure that assumes $2.50 per 1M input tokens). Practical optimizations: shorten field names (category → cat, description → desc), remove verbose descriptions from schema properties, flatten nested objects where possible, and split complex extraction into two targeted calls with small schemas rather than one call with a monolithic schema. The two-call approach often costs less in total because each schema is much smaller, but the shared prompt text is sent twice, so the split pays off only when the schema tokens saved exceed the repeated prompt tokens.
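The arithmetic above can be reproduced with a short sketch. The function names are hypothetical, the $2.50 per 1M input tokens rate is the one implied by the report's own numbers, and the comparison counts input tokens only (output tokens, prompt caching, and latency are ignored).

```python
def daily_schema_cost_usd(
    schema_tokens: int,
    calls_per_day: int,
    usd_per_million_input_tokens: float,
) -> float:
    """Daily spend attributable to schema tokens alone."""
    extra_tokens = schema_tokens * calls_per_day
    return extra_tokens / 1_000_000 * usd_per_million_input_tokens


def two_calls_cheaper(
    prompt_tokens: int,
    big_schema_tokens: int,
    small_schema_a_tokens: int,
    small_schema_b_tokens: int,
) -> bool:
    """Compare input tokens for one monolithic call vs. two targeted calls.

    The shared prompt is paid for twice in the split version.
    """
    one_call = prompt_tokens + big_schema_tokens
    two_calls = 2 * prompt_tokens + small_schema_a_tokens + small_schema_b_tokens
    return two_calls < one_call


# The report's example: 1500 schema tokens, 1M calls/day, $2.50 per 1M tokens.
print(daily_schema_cost_usd(1500, 1_000_000, 2.50))  # 3750.0

# Illustrative numbers only: a 300-token prompt with a 1500-token schema
# split into two 400-token schemas.
print(two_calls_cheaper(300, 1500, 400, 400))  # True
```

With a long shared prompt the comparison flips: for example, a 2000-token prompt with the same schemas makes the single call cheaper on input tokens.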
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:30:53.947900+00:00— report_created — created