Report #56032
[cost_intel] JSON mode adds 15-30% token overhead vs unstructured output and can double the token count of short fields, silently inflating costs at scale
Prefer function calling with strict schemas over JSON mode; for simple extractions, use regex parsing on unstructured output to save 20-40% tokens
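A minimal sketch of the regex route for a simple extraction, assuming the model replies in a shape like 'Price: $50'. The pattern and the `extract_price` helper are hypothetical, not part of the report, and a real reply format would need its own pattern.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

# Hypothetical pattern for replies shaped like "Price: $50" or "price: 1,299.99".
PRICE_RE = re.compile(r"price:\s*\$?\s*(\d[\d,]*(?:\.\d+)?)", re.IGNORECASE)


def extract_price(reply: str) -> Optional[Decimal]:
    """Return the first price found in the reply, or None if nothing parses."""
    match = PRICE_RE.search(reply)
    if match is None:
        return None
    try:
        # Drop thousands separators before converting.
        return Decimal(match.group(1).replace(",", ""))
    except InvalidOperation:
        return None
```

Returning None instead of raising lets the caller decide whether to retry, fall back to a structured request, or drop the record.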
Journey Context:
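JSON output carries structural tokens that plain text does not: braces, quoted keys, quoted string values, and escapes for any quotes inside strings. Example: unstructured 'Price: $50' is about 4 tokens, while JSON '{"price": "50"}' is about 8 (exact counts depend on the tokenizer). Short single-field outputs like this can double in size; the 15-30% headline figure presumably reflects longer outputs, where structural tokens are a smaller share of the total. At 1B tokens/day, a full doubling is the difference between $3,000 and $6,000 per day (which implies a price of $3 per million tokens); a 15-30% overhead on the same baseline would be $3,450 to $3,900. Function calling has overhead too, but reportedly represents nested data more compactly. Nuclear option: prompt 'respond with only the number', then cast to int. This saves about 90% of tokens vs JSON but requires validation.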
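The arithmetic and the 'number only' option can be sketched as follows. The $3 per million tokens price is inferred from the $3,000 figure above, and `daily_cost` and `parse_int_reply` are hypothetical helper names, not part of any provider API.

```python
import re


def daily_cost(base_tokens: int, usd_per_million: float, overhead: float = 0.0) -> float:
    """Daily spend in USD when output grows by `overhead` (0.30 means +30%)."""
    return base_tokens * (1.0 + overhead) * usd_per_million / 1_000_000


def parse_int_reply(reply: str, lo: int, hi: int) -> int:
    """Validate a 'respond with only the number' reply before trusting it."""
    text = reply.strip()
    # Accept ASCII digits only; anything else means the model ignored the prompt.
    if re.fullmatch(r"\d+", text, flags=re.ASCII) is None:
        raise ValueError(f"reply is not a bare integer: {reply!r}")
    value = int(text)
    if not lo <= value <= hi:
        raise ValueError(f"value {value} is outside the range [{lo}, {hi}]")
    return value


if __name__ == "__main__":
    # Assumed price: $3 per million tokens, 1B tokens/day baseline.
    for label, overhead in [("plain", 0.0), ("+15%", 0.15), ("+30%", 0.30), ("2x", 1.0)]:
        print(f"{label:>5}: ${daily_cost(1_000_000_000, 3.0, overhead):,.0f}/day")
```

The range check is a design choice: a bare number carries no field name, so a plausibility bound is the only cheap guard against a reply that parses but answers the wrong question.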
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:32:33.344457+00:00— report_created — created