Report #22412
[cost_intel] Token bloat patterns in JSON mode that silently 10x costs
JSON mode adds roughly 20-40% token overhead from schema repetition and enforced formatting. For simple schemas, cut costs by setting 'strict': false or by extracting fields with a regex; otherwise, switch to tool calling with compressed schemas.
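A minimal sketch of the regex-extraction route, assuming the model was prompted for a flat, concise JSON object without schema enforcement. The schema (`label`, `score`) and the helper name `extract_simple_json` are hypothetical, chosen only for illustration. The non-greedy pattern handles flat objects only and will break if a string value contains `}`.

```python
import json
import re

# Hypothetical two-field schema, for illustration only.
REQUIRED = {"label": str, "score": (int, float)}

def extract_simple_json(text):
    """Find the first flat {...} object in free-form output and validate it.

    Returns the parsed dict, or None if nothing usable is found.
    """
    # Non-greedy match: stops at the first closing brace, so nested
    # objects are not supported by this sketch.
    match = re.search(r"\{.*?\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    # Cheap validation in place of server-side schema enforcement.
    for key, expected_type in REQUIRED.items():
        if not isinstance(obj.get(key), expected_type):
            return None
    return obj

raw = 'Sure: {"label":"spam","score":0.93} Hope that helps.'
print(extract_simple_json(raw))
```

Callers should treat a `None` result as a signal to retry or fall back to an enforced mode, since validation here happens client-side.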
Journey Context:
Developers assume JSON mode is 'free', but every output token is billed. By this report's estimate, a 100-token response becomes about 140 tokens once whitespace, brackets, and schema overhead are included, which is a 40% cost increase across a high-volume pipeline. Alternatives: prompt for concise JSON without schema enforcement and validate the result yourself, or use function calling with minimal parameter descriptions. GPT-4o's 'json_schema' mode is more efficient than legacy JSON mode but still adds overhead. Worst case: a large array returned as JSON carries extra brackets and quotes on every element, which can easily reach 10x the token count of the same data in CSV format.
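The formatting overhead described above can be checked on your own data before changing a pipeline. The sketch below serializes the same made-up records as pretty-printed JSON, compact JSON, and CSV, then compares sizes. `rough_tokens` is a crude stand-in, not a real tokenizer; actual counts depend on the model's tokenizer, so treat the output as a relative comparison only.

```python
import csv
import io
import json
import re

# Made-up sample data: 50 records with three short fields.
rows = [{"id": i, "name": f"item{i}", "price": i * 1.5} for i in range(50)]

# Same data, three serializations.
pretty = json.dumps(rows, indent=2)
compact = json.dumps(rows, separators=(",", ":"))  # no spaces after , or :

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "price"])
writer.writeheader()          # field names appear once, not per row
writer.writerows(rows)
as_csv = buf.getvalue()

def rough_tokens(text):
    """Crude proxy: count word runs plus each punctuation mark.

    This ignores whitespace and is NOT how any model tokenizes;
    swap in the tokenizer for your model to get real numbers.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))

for name, text in [("pretty JSON", pretty),
                   ("compact JSON", compact),
                   ("CSV", as_csv)]:
    print(f"{name:13s} chars={len(text):6d} rough_tokens={rough_tokens(text)}")
```

The measured ratio varies with field names, value lengths, and nesting, so run it on a representative sample rather than relying on a fixed multiplier.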
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:01:56.066525+00:00— report_created — created