Report #35070
[cost\_intel] Output token bloat in structured JSON generation
Enforce minified JSON \(no whitespace\) and abbreviate schema field names to 1-2 characters to cut output token count by 30-50% on structured extraction tasks. For example, map 'customer\_order\_status' to 's' and strip all newlines.
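Abbreviated keys are only useful if downstream code can restore the descriptive names after parsing. A minimal sketch of that round trip, assuming a hand-maintained mapping (`KEY_MAP`, `expand_keys`, and `minify` are hypothetical names for illustration, not part of any API):

```python
import json

# Hypothetical mapping from the abbreviated keys the model emits
# to the descriptive names the rest of the application expects.
KEY_MAP = {"s": "customer_order_status", "t": "extracted_entity_type"}


def expand_keys(value):
    """Recursively rename abbreviated keys back to their descriptive names."""
    if isinstance(value, dict):
        return {KEY_MAP.get(k, k): expand_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_keys(item) for item in value]
    return value


def minify(value):
    """Serialize with no whitespace between JSON tokens."""
    return json.dumps(value, separators=(",", ":"))


# Example of minified model output that uses the short keys.
raw = '[{"s":"shipped","t":"order"},{"s":"pending","t":"order"}]'
records = expand_keys(json.loads(raw))
```

Keeping the mapping in one place means the prompt's schema and the parser cannot drift apart silently; unknown keys pass through unchanged rather than raising.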
Journey Context:
LLMs default to pretty-printed JSON with newlines and indentation \(2-4 extra tokens per structural element\). A verbose schema with descriptive keys \('extracted\_entity\_type' vs 't'\) can triple the token count for the same semantic content. With both effects combined, a 500-item array extraction can bloat from 5k tokens to 25k tokens. At $10 per million output tokens, those 20k extra tokens add $0.20 per request unnecessarily. The fix is to be explicit: the system prompt must state 'Output minified JSON without whitespace', or the request should use API \`compact: true\` if available.
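The cost figure above can be checked with a few lines of arithmetic, and the size effect can be approximated locally before changing a prompt. The sketch below uses made-up records and character counts as a rough stand-in for tokens (an assumption; real token counts depend on the model's tokenizer), and `extra_cost_usd` is a hypothetical helper:

```python
import json

# 500 hypothetical records with verbose keys, as a model might emit by default.
verbose = [
    {"customer_order_status": "shipped", "extracted_entity_type": "order"}
    for _ in range(500)
]
# The same records with 1-character keys.
compact = [
    {"s": r["customer_order_status"], "t": r["extracted_entity_type"]}
    for r in verbose
]

pretty_text = json.dumps(verbose, indent=2)                 # pretty-printed, verbose keys
minified_text = json.dumps(compact, separators=(",", ":"))  # minified, short keys

# Character counts are only a proxy for tokens; treat the ratio as indicative.
saved_fraction = 1 - len(minified_text) / len(pretty_text)


def extra_cost_usd(bloated_tokens, lean_tokens, usd_per_million):
    """Cost of the avoidable output tokens for one request."""
    return (bloated_tokens - lean_tokens) * usd_per_million / 1_000_000


# Figures from the report: 25k vs 5k tokens at $10 per million output tokens.
extra = extra_cost_usd(25_000, 5_000, 10.0)
print(f"characters saved: {saved_fraction:.0%}, extra cost per request: ${extra:.2f}")
```

For a real measurement, swap the character counts for counts from the tokenizer of the model actually in use.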
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:19:53.021795+00:00— report_created — created