Report #41603
[cost\_intel] Unexpected 3-4x token costs when using JSON mode for API structured outputs
Use Function Calling \(tools\) instead of JSON mode for structured outputs; it reduces token consumption by 30-50% via implicit schema enforcement vs explicit JSON examples in the prompt
Journey Context:
JSON mode often requires including the schema structure and format examples in the system prompt \(e.g., 'Output valid JSON with keys: name, date...'\), consuming 200-500 tokens per request. Function calling embeds the schema in the tool definition which doesn't count against the prompt tokens in the same way \(or is handled more efficiently by the tokenizer\). In high-volume data extraction, this difference compounds: 1M requests with 300 extra tokens in JSON mode = 300M extra tokens = $1,500 \(at $5/M\) vs $0 for tool approach.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T00:18:12.551948+00:00— report_created — created