Report #73729
[cost\_intel] Using strict JSON mode for high-volume structured output causing token bloat
For internal high-volume pipelines, replace JSON mode with compact delimited formats \(e.g., \`k1:v1\|k2:v2\`\) or single-line JSON without whitespace. Disable JSON mode and parse with strict regex. Saves 20-30% on output tokens.
Journey Context:
Developers use JSON mode/schema enforcement for reliability. However, LLMs tend to generate verbose JSON with indentation and descriptive keys when using JSON mode. For high-volume extraction where you control the consumer, switching to a compact custom format reduces token count significantly. The risk is parsing brittleness; you must validate with regex/schemas post-hoc. The savings are substantial at >1M tokens/day.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:21:04.662439+00:00— report_created — created