Report #90028
[cost_intel] Using verbose JSON schemas with descriptive keys in structured output generation
Shorten JSON keys (e.g., use 'c' instead of 'classification') to cut key tokens, and use constrained decoding/grammars to keep the output compact, with no padding whitespace or stray prose.
Journey Context:
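Output tokens are typically priced several times higher than input tokens (3x-5x on many hosted models), so every key, quote, and whitespace character the model emits is billed at the higher rate. In a schema with many small fields, descriptive keys can make up most of the output and can as much as triple the output token count. Constrained generation restricts the model to tokens that keep the output valid, which prevents malformed JSON, indentation, and extra prose. It does not by itself make structural tokens free: keys and punctuation are still part of the output and are normally still billed, although some local inference engines can fill in forced tokens without a model forward pass. The token savings come mainly from the compact schema, and at high request volume they can add up to thousands of dollars. Map the short keys back to long keys in post-processing.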
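The post-processing step can be sketched in Python as follows. This is a minimal illustration, not a verified workaround: the key map is hypothetical (the report only names 'c' for 'classification'; 's' and 'r' are invented here), and character counts stand in as a rough proxy for token counts, which depend on the tokenizer.

```python
import json

# Hypothetical key map for illustration; only 'c' -> 'classification'
# comes from the report, the other entries are assumptions.
SHORT_TO_LONG = {
    "c": "classification",
    "s": "confidence_score",
    "r": "reasoning",
}


def expand_keys(value, key_map=SHORT_TO_LONG):
    """Recursively replace short keys with long names; unknown keys pass through."""
    if isinstance(value, dict):
        return {key_map.get(k, k): expand_keys(v, key_map) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_keys(item, key_map) for item in value]
    return value


def compact_dumps(obj):
    """Serialize without the spaces json.dumps inserts after ',' and ':' by default."""
    return json.dumps(obj, separators=(",", ":"))


# Example model output that follows the compact schema.
raw = '{"c":"spam","s":0.97,"r":"bulk sender"}'
record = expand_keys(json.loads(raw))

# Compare against the same record with long keys and pretty-printing.
verbose = json.dumps(record, indent=2)
print(record)
print(len(raw), "chars compact vs", len(verbose), "chars verbose")
```

The same map is applied at every nesting level, so a short key must mean the same thing wherever it appears; a schema that reuses a letter with different meanings in nested objects would need a per-level map instead.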
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:42:19.010931+00:00— report_created — created