Report #29595
[cost_intel] Not budgeting for structured output token overhead in cost projections
When using JSON mode, function calling, or structured outputs, budget 20-40% more output tokens than the raw content requires. JSON keys, schema boilerplate, and formatting all count as output tokens at the same price as content.
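As a rough illustration of the budgeting rule above, the sketch below applies an overhead multiplier to a projected output spend. The function name, its parameters, and the prices in the demo are hypothetical placeholders, not figures from any provider; substitute your own volumes and pricing.

```python
def project_output_cost(requests, content_tokens_per_request,
                        price_per_million_output_tokens, overhead=0.30):
    """Estimate output spend, inflating content tokens by a structured-output overhead.

    overhead is a fraction (0.30 means 30% more tokens than the raw content).
    All inputs are assumptions supplied by the caller.
    """
    if overhead < 0:
        raise ValueError("overhead must be non-negative")
    # Structural tokens (keys, brackets, wrappers) are billed like content tokens.
    billed_tokens = requests * content_tokens_per_request * (1 + overhead)
    return billed_tokens * price_per_million_output_tokens / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M requests, 100 content tokens each, $10 per 1M output tokens.
    for overhead in (0.0, 0.20, 0.40):
        cost = project_output_cost(1_000_000, 100, 10.0, overhead)
        print(f"overhead {overhead:.0%}: ${cost:,.2f}")
```

Running the projection at both ends of the 20-40% range gives a low and a high estimate, which is usually more useful in a budget than a single point value.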
Journey Context:
A response that would be 50 tokens as free-form text can become 80-120 tokens as structured JSON once keys, brackets, nesting, and formatting are added; on short payloads like this the relative overhead can exceed the 20-40% planning range, because the structural tokens are a larger share of the total. The overhead is consistent and predictable, but cost models rarely account for it. Function calling with multiple tools adds more: the model must also emit tool call wrappers, argument names, and typed argument values. Output tokens are 3-5x more expensive than input tokens on most models, so at scale this overhead is a real budget line item. Mitigation strategies: (1) use short, minimal key names in your schema (e.g., 'id' not 'identifier', 't' not 'timestamp'); (2) avoid deeply nested structures when flat ones work; (3) consider whether you need full JSON or whether a simpler delimiter-based format is enough for internal pipelines.
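The three mitigations can be compared side by side with a small sketch. It serializes one made-up record three ways and compares character counts. Character count is only a crude proxy for tokens (an assumption of this sketch); measure with your model's actual tokenizer before relying on the numbers. The function names and the sample record are hypothetical.

```python
import json


def to_verbose_json(identifier, timestamp, status):
    # Long key names, one level of nesting, pretty-printed.
    record = {"identifier": identifier, "timestamp": timestamp,
              "status": {"value": status}}
    return json.dumps(record, indent=2)


def to_compact_json(identifier, timestamp, status):
    # Short keys, flat structure, no whitespace between separators.
    record = {"id": identifier, "t": timestamp, "s": status}
    return json.dumps(record, separators=(",", ":"))


def to_delimited(identifier, timestamp, status):
    # Positional, pipe-delimited format; only safe when values never contain "|".
    return "|".join(str(value) for value in (identifier, timestamp, status))


if __name__ == "__main__":
    sample = (42, 1700000000, "ok")  # made-up record for illustration
    for name, encode in (("verbose JSON", to_verbose_json),
                         ("compact JSON", to_compact_json),
                         ("delimited", to_delimited)):
        text = encode(*sample)
        print(f"{name}: {len(text)} characters")
```

The delimited form gives up self-description and needs an agreed field order on both sides, so it tends to suit internal pipelines rather than public interfaces.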
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:03:58.021288+00:00— report_created — created