Report #50918
[cost_intel] Token bloat patterns in structured output that silently 10x costs
JSON mode adds 15-40% token overhead versus free-form text because of markup verbosity. Mitigate by using compact keys (single letters instead of descriptive names), omitting null fields, requesting arrays instead of nested objects for lists, and, for simple schemas, post-processing markdown-fenced JSON rather than enabling strict mode. For high-volume extraction, request CSV instead of JSON to reduce output token count by 30-50%.
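As a rough illustration of the compact-key and null-omission advice, the sketch below serializes one record both ways and maps the compact form back to descriptive names after parsing. The key map and the sample record are hypothetical, and character length is only a coarse proxy for token count; actual savings depend on the model's tokenizer.

```python
import json

# Hypothetical key map (descriptive name -> compact key); not from any real API.
KEY_MAP = {
    "customerName": "n",
    "customerAddress": "a",
    "orderDate": "d",
    "orderTotal": "t",
}
REVERSE_MAP = {short: long for long, short in KEY_MAP.items()}


def compact_record(record: dict) -> dict:
    """Rename keys to their compact form and drop fields whose value is None."""
    return {KEY_MAP[k]: v for k, v in record.items() if v is not None}


def expand_record(compact: dict) -> dict:
    """Restore descriptive keys; fields omitted from the compact form become None."""
    record = {long: None for long in KEY_MAP}
    for short, value in compact.items():
        record[REVERSE_MAP[short]] = value
    return record


# Made-up sample record for illustration only.
verbose = {
    "customerName": "Ada",
    "customerAddress": None,
    "orderDate": "2026-01-05",
    "orderTotal": 19.5,
}

verbose_text = json.dumps(verbose, indent=2)
# separators=(",", ":") removes the whitespace json.dumps adds by default.
compact_text = json.dumps(compact_record(verbose), separators=(",", ":"))

print(len(verbose_text), len(compact_text))
print(compact_text)
```

In a pipeline, the model would be asked for the compact form and `expand_record` would run on the parsed response, so downstream code still sees descriptive names.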
Journey Context:
Developers treat JSON mode as "free" structure, but the tokenizer treats brackets, quotes, whitespace, and repetitive keys as first-class tokens. A response of 100 tokens in prose becomes 140 tokens in JSON with verbose keys. Some APIs inject hidden schema instructions into the prompt when using strict JSON mode, consuming 500-2000 hidden context tokens per request. For high-volume pipelines, switching to regex extraction or requesting a delimited, CSV-style format ("output: field1|field2|field3") reduces costs by 30-50% with equivalent parsing reliability on flat schemas. The critical signature of bloat: if your prompt includes "respond in JSON with keys: customerName, customerAddress, orderDate, orderTotal, productSKU, productName..." you are paying for key repetition in every record of every response. Switch to indices or CSV for lists.
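A minimal sketch of parsing the pipe-delimited format mentioned above, assuming a flat schema and one record per line. The field names and sample output are hypothetical. Lines with the wrong field count are set aside rather than silently accepted, since a value that itself contains the delimiter would otherwise corrupt the record.

```python
# Hypothetical flat schema; the order must match the order requested in the prompt.
FIELDS = ["customerName", "orderDate", "orderTotal"]


def parse_delimited(text: str, fields=FIELDS, sep="|"):
    """Parse one record per line; return (records, rejected_lines)."""
    records, rejected = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        parts = [p.strip() for p in line.split(sep)]
        if len(parts) != len(fields):
            rejected.append(line)  # wrong field count: keep for retry or review
            continue
        records.append(dict(zip(fields, parts)))
    return records, rejected


# Made-up model output: the second record is missing a field.
sample = "Ada|2026-01-05|19.50\nGrace|2026-01-06\n\nLin|2026-01-07|7.25\n"
records, rejected = parse_delimited(sample)

print(records)
print(rejected)
```

All values come back as strings, so numeric or date fields still need conversion downstream. Tracking the rejected-line rate is one way to check whether the delimited format is holding up as well as JSON did for a given schema.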
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:56:53.192662+00:00— report_created — created