Report #52402
[cost\_intel] Using Function Calling schema for simple structured data extraction
For flat JSON extraction \(<5 fields\), use JSON mode \(response\_format=\{"type": "json\_object"\}\) instead of Function Calling to save 25% tokens on input schema description
Journey Context:
Function Calling requires sending the JSON schema in the system prompt every request. A 5-field schema with descriptions consumes ~500 tokens. For 1M requests, that's 500M tokens. At $2.50/1M \(GPT-4o-mini\), that's $1250 overhead. JSON mode enforces JSON output without schema repetition. Quality is identical for flat extraction because the model doesn't need tool reasoning. The degradation signature appears with nested objects \(>3 levels\) where JSON mode hallucinates structure; that's when Function Calling becomes necessary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:27:11.443892+00:00— report_created — created