Report #55933
[cost\_intel] Structured output modes cost the same as free-form text output
Minimize schema complexity in structured output. Use enums instead of free-text fields, omit optional explanation fields, and explicitly instruct the model to keep output minimal. A classification that only needs to return an enum value can bloat 5-10x in token count when the schema invites explanation fields, and output tokens cost 3-5x more than input tokens on most models.
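A minimal sketch of the difference between a permissive schema and an enum-only one, written as plain JSON Schema dictionaries. The label set, the schema names, and the `parse_label` helper are hypothetical and only illustrate the idea; how the schema is attached to a request depends on the provider and is not shown here.

```python
import json

# Hypothetical label set for a sentiment classifier.
LABELS = ["positive", "negative", "neutral"]

# Permissive schema: free-text label plus fields that invite extra output tokens.
PERMISSIVE_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string"},
        "explanation": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["label", "explanation", "confidence"],
}

# Minimal schema: a single enum field, no explanation, no confidence.
MINIMAL_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {
            "type": "string",
            "enum": LABELS,
            "description": "Sentiment label only. Be concise.",
        },
    },
    "required": ["label"],
    "additionalProperties": False,
}


def parse_label(raw: str) -> str:
    """Accept either minimal JSON or a bare free-form label and normalize it."""
    text = raw.strip()
    try:
        data = json.loads(text)
        if isinstance(data, dict):
            text = str(data.get("label", ""))
    except json.JSONDecodeError:
        pass  # Not JSON: treat the raw text as the label itself.
    label = text.strip().strip("'\"").lower()
    if label not in LABELS:
        raise ValueError(f"unexpected label: {raw!r}")
    return label


print(parse_label('{"label": "positive"}'))
print(parse_label(" Negative\n"))
```

The same parser covers the "skip structured output entirely" option: if the model is simply asked to answer with one word, the bare text goes through the fallback path and is still checked against the allowed labels.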
Journey Context:
Structured output modes \(OpenAI function calling, Anthropic tool use\) add overhead in two places: the schema specification included with each request \(input tokens\) and model verbosity in the response \(output tokens\). The input overhead is fixed and modest. The output overhead is the silent killer: a model that would return 'positive' in free-form text tends to return \{'label': 'positive', 'explanation': 'The customer expressed satisfaction with the product quality and delivery speed...', 'confidence': 0.92\} when given a permissive schema. Output tokens cost $10-15/M on frontier models vs $2.50-3/M for input, so a 50-token free-form response bloated to 200 tokens of structured JSON costs 4x as much in output tokens per call. On a 1M-call/month pipeline at $10/M output, that is the difference between $500/month and $2,000/month in output costs. Fix: use enums, mark explanation fields as not required, add 'be concise' to the schema description, or skip structured output entirely for simple classifications and parse free-form text.
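The monthly figures above can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes the low end of the quoted output price \($10 per million tokens\) and counts output tokens only; the function name is illustrative.

```python
def monthly_output_cost(tokens_per_call: int, calls_per_month: int,
                        usd_per_million_tokens: float) -> float:
    """Output-token spend per month in USD, ignoring input tokens."""
    total_tokens = tokens_per_call * calls_per_month
    return total_tokens * usd_per_million_tokens / 1_000_000


CALLS_PER_MONTH = 1_000_000
OUTPUT_PRICE = 10.0  # USD per million output tokens (assumed, low end of the range)

free_form = monthly_output_cost(50, CALLS_PER_MONTH, OUTPUT_PRICE)
structured = monthly_output_cost(200, CALLS_PER_MONTH, OUTPUT_PRICE)

print(f"free-form:  ${free_form:,.0f}/month")
print(f"structured: ${structured:,.0f}/month")
print(f"ratio:      {structured / free_form:.0f}x")
```

Swapping in the actual per-call token counts and the current price for a given model shows whether trimming the schema is worth the effort for a particular pipeline.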
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:22:34.955523+00:00— report_created — created