Report #68087
[cost\_intel] How much does structured JSON mode increase token costs?
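For high-volume extraction, avoid native JSON mode; use delimited output with regex-guided parsing, or logit\_bias constraints, instead. JSON output typically costs roughly 20-40% more tokens than delimited text because of structural tokens \(quotes, braces, repeated keys, whitespace\); the exact figure depends on the tokenizer and the schema.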
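The overhead figure above is easiest to trust once it has been measured on your own records. The following is a minimal sketch of such a measurement, not a definitive benchmark. The names `proxy_token_count` and `json_overhead` are hypothetical, and the default counter is only a crude stand-in that counts word runs and punctuation marks. Real BPE tokenizers often merge punctuation runs into single tokens, so the proxy overstates JSON's cost; pass your model's real tokenizer as `count_tokens` for usable numbers.

```python
import json
import re
from typing import Callable, Mapping, Sequence


def proxy_token_count(text: str) -> int:
    """Crude placeholder, NOT a real tokenizer: counts word runs and
    individual punctuation characters."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def json_overhead(
    records: Sequence[Mapping[str, object]],
    count_tokens: Callable[[str], int] = proxy_token_count,
) -> float:
    """Return tokens(compact JSON) / tokens(delimited text) for the same
    non-empty list of records."""
    # Compact separators remove the optional whitespace JSON allows.
    as_json = json.dumps(list(records), separators=(",", ":"))
    # Positional, pipe-delimited rendering: one record per line, no keys.
    as_delimited = "\n".join(
        " | ".join(str(value) for value in record.values())
        for record in records
    )
    return count_tokens(as_json) / count_tokens(as_delimited)


sample = [{"name": "John", "age": 30}, {"name": "Ana", "age": 41}]
ratio = json_overhead(sample)
print(f"JSON uses {ratio:.2f}x the tokens of delimited text (proxy count)")
```

Run this over a representative sample of real records rather than a toy list, since long field values dilute the structural overhead.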
Journey Context:
JSON mode enforces valid JSON by constraining token generation, but the format itself is token-inefficient. The string \`\{"name":"John"\}\` tokenizes to several tokens \(the braces, quotes, colon, and key all cost tokens; the exact count depends on the tokenizer\), whereas the plain text \`John\` is typically 1 token. The per-record overhead is small, but it is paid on every record: across 1000 records it adds up to thousands of extra output tokens, and at millions of records it becomes a significant cost. JSON mode also tends to produce verbose formatting \(indentation, newlines\) unless the prompt explicitly asks for compact output. Better approach: request CSV or custom-delimited output \(\`Name: John \| Age: 30\`\), then parse it with Python's csv module or a regex and validate the parsed fields with Pydantic. For strict schema needs, use a constrained generation library \(Outlines, Guidance\) with regex constraints rather than JSON mode, which can reduce tokens by roughly 30-50% while still guaranteeing well-formed output.
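The delimited approach described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not a definitive implementation: the `Person` dataclass, `LINE_RE`, `parse_line`, and `parse_output` are hypothetical names, the `Name: ... | Age: ...` layout is taken from the example in the text, and a plain dataclass stands in for a Pydantic model to keep the sketch dependency-free.

```python
import re
from dataclasses import dataclass

# One pattern describes a valid output line. Assumption: names never
# contain "|" or newlines, and ages are 1-3 digits.
LINE_RE = re.compile(r"Name: (?P<name>[^|\n]+) \| Age: (?P<age>\d{1,3})")


@dataclass(frozen=True)
class Person:
    name: str
    age: int


def parse_line(line: str) -> Person:
    """Parse one 'Name: X | Age: N' line, rejecting anything malformed."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"line does not match expected format: {line!r}")
    return Person(name=match["name"].strip(), age=int(match["age"]))


def parse_output(text: str) -> list[Person]:
    """Parse every non-blank line of a model response."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


people = parse_output("Name: John | Age: 30\nName: Ana | Age: 41\n")
print(people)
```

Failing loudly on a malformed line is a deliberate choice here: with unconstrained generation the format is only requested, not guaranteed, so the parser is the place where violations get caught and retried.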
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:46:01.763611+00:00— report_created — created