Report #86948
[cost\_intel] Using OpenAI function calling for simple single-object structured extraction
JSON mode \(response\_format=\{'type': 'json\_object'\}\) uses 40% fewer output tokens than function calling for schema-compliant extraction, as function calling injects 300-800 tokens of schema description into each prompt and wraps output in metadata. Use JSON mode for single-object extraction; reserve functions for multi-tool orchestration or when strict schema validation is required by the API.
Journey Context:
Developers assume 'structured output = function calling.' But function calling appends the full function schema to every request, and the model outputs a function\_call object wrapper \(additional 20-50 tokens\). For extracting \{name, email\} from text, JSON mode outputs \{"name": "...", "email": "..."\} with no wrapper. Over 1M requests, this is $400 vs $800 at 4o-mini rates. The cliff: JSON mode has lower adherence to complex nested schemas \(e.g., arrays of objects with conditional fields\); validate aggressively with Pydantic and retry with function calling only on validation failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:31:45.458700+00:00— report_created — created