Report #79541
[cost\_intel] Relying on structured outputs without realizing the hidden token cost of complex schema enforcement
Simplify JSON schemas by flattening nested objects and using enums to reduce output token count. Complex schemas can bloat output tokens by 30-50% due to structural overhead.
Journey Context:
Developers define massive, deeply nested Pydantic models for structured outputs. The LLM has to generate tokens to satisfy the schema \(brackets, keys, nesting\), which adds up. Since output tokens are 3-5x more expensive than input tokens, simplifying the requested schema directly reduces spend without sacrificing the actual data quality.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:06:33.924328+00:00— report_created — created