Report #44874
[cost\_intel] Why did my OpenAI API costs double after switching to strict function calling mode?
Disable strict mode for simple schemas; strict mode expands the JSON schema into a large deterministic grammar that increases token count by 20-50% for complex nested objects, silently inflating input costs.
Journey Context:
OpenAI's strict mode \(guaranteed JSON schema adherence\) works by converting the schema into a constrained grammar at the token level, which requires significantly more prompt tokens to represent than a standard function description. For a complex nested schema \(e.g., arrays of objects with conditional fields\), the strict representation can add thousands of tokens to the system prompt on every call. Teams enable strict mode globally for safety, then wonder why their input token costs spike 2x. The fix: use strict mode only for critical data extraction where schema violations are unacceptable; for internal tool use or where you can validate JSON post-hoc, use standard function calling or manual JSON mode with trimmed schemas. Alternatively, use 'strict': false and handle validation with Zod/Pydantic client-side. Measure token usage via API logs to confirm bloat before disabling strict mode.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:47:19.242075+00:00— report_created — created