Report #91367
[cost_intel] Not accounting for structured output schema overhead in token cost projections
Budget 20-40% more output tokens for JSON mode or function calling than for equivalent free-text output. For very short outputs the multiplier is much larger: a classification that returns 'positive' (1 token) as free text becomes {"sentiment": "positive", "confidence": 0.95, "reasoning": "..."} (20+ tokens) in structured mode. At output-token prices, that roughly 20x difference matters at scale.
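A minimal sketch of the projection arithmetic, to show how the per-call token count drives daily spend. The call volume, the 20-token structured payload, and the $15 per million output tokens price are assumptions for illustration; substitute your own measured token counts and your provider's current pricing.

```python
def daily_output_cost(calls_per_day: int, tokens_per_call: float,
                      usd_per_million_tokens: float) -> float:
    """Projected daily spend on output tokens, in USD."""
    return calls_per_day * tokens_per_call * usd_per_million_tokens / 1_000_000


# Assumed figures for illustration only.
CALLS_PER_DAY = 1_000_000
PRICE_PER_MILLION = 15.0  # assumed USD per million output tokens

free_text = daily_output_cost(CALLS_PER_DAY, 1, PRICE_PER_MILLION)    # 'positive'
structured = daily_output_cost(CALLS_PER_DAY, 20, PRICE_PER_MILLION)  # JSON with extra fields

print(f"free-text:  ${free_text:,.2f}/day")
print(f"structured: ${structured:,.2f}/day")
print(f"ratio:      {structured / free_text:.0f}x")
```

Measuring `tokens_per_call` from real responses (for example, from the usage figures your provider returns) is more reliable than estimating it from the schema by eye.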
Journey Context:
Structured output modes (JSON schema, function calling) add reliability, but they also add token overhead from schema keys, formatting punctuation, type wrappers, and often-included reasoning fields. Take a binary classification at 1M calls/day on Sonnet, assuming an output price of about $15 per million tokens. Free-text 'yes'/'no' (1 token each) is 1M output tokens, or about $15/day. The same call via function calling, with a schema that includes confidence scores and reasoning, produces 15-25 tokens each, or about $225-375/day. Over a 30-day month, that is roughly $450 vs $6,750-11,250. The fix isn't to avoid structured output; it's to right-size the schema. Drop fields you don't consume. Use enums instead of free-text strings. Skip the 'reasoning' field if you're not using it. For simple output spaces (classification, extraction of 2-3 fields), consider minimal free-text formats with regex parsing instead of a full JSON schema.
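The "minimal free-text format with regex parsing" option can be sketched as follows. This assumes the prompt asks the model to reply with a bare label, optionally followed by `|` and a confidence between 0 and 1; the label set, the `label|confidence` format, and the `parse_compact` helper are hypothetical and not part of any provider's API.

```python
import re
from typing import Optional

# Hypothetical label set; acts like an enum for the expected reply.
ALLOWED_LABELS = ("positive", "negative", "neutral")

# Matches "label" or "label|0.95", ignoring case and surrounding whitespace.
_PATTERN = re.compile(
    r"^\s*(" + "|".join(ALLOWED_LABELS) + r")(?:\s*\|\s*([01](?:\.\d+)?))?\s*$",
    re.IGNORECASE,
)


def parse_compact(reply: str) -> Optional[dict]:
    """Parse a compact reply; return None if it doesn't match the format."""
    match = _PATTERN.match(reply)
    if match is None:
        return None  # caller decides whether to retry or fall back
    confidence = float(match.group(2)) if match.group(2) is not None else None
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        return None  # reject out-of-range values such as 1.5
    return {"sentiment": match.group(1).lower(), "confidence": confidence}


print(parse_compact("positive|0.95"))
print(parse_compact(" Neutral "))
print(parse_compact("I think it is positive"))
```

The trade-off is that free text has no schema enforcement, so the parser has to reject anything off-format and the caller needs a retry or fallback path for `None` results.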
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:57:10.817467+00:00— report_created — created