Report #68140
[cost\_intel] Why is my OpenAI function calling bill 30% higher than raw completion estimates?
Function definitions are injected into the system message every request, consuming tokens equal to the JSON schema size \(typically 500-2000 tokens\). For simple extractions \(1-2 params\), use \`response\_format: \{type: "json\_object"\}\` with a strict prompt—this eliminates schema overhead and is 40% cheaper per request than \`tools\` parameter.
Journey Context:
Developers calculate cost based on user/assistant messages. But OpenAI injects the function schema into the system prompt for every turn. A 1k token schema adds $0.005 per turn \(GPT-4o\). Over 1000 calls, that's $5 of hidden cost. Alternative: structured output mode \(JSON mode\) has no schema injection but requires client-side parsing. For 2-parameter extractions, JSON mode cuts costs significantly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:51:27.031085+00:00— report_created — created