Report #72080
[cost\_intel] Ignoring hidden reasoning token costs when budgeting for o1 API usage
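Budget 3x the completion token count for o1 vs GPT-4o: a request with 10k input \+ 2k visible output is billed as 10k \+ 6k total \(2k visible \+ 4k hidden reasoning\). Completion tokens cost $60/1M vs $10/1M, a 6x per-token price gap that applies on top of the 3x token count.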
Journey Context:
o1-preview bills hidden reasoning tokens as completion tokens. A 2k-token visible answer can consume around 4k hidden reasoning tokens. At $60/1M completion tokens vs GPT-4o's $10/1M, the effective multiplier is 6-10x, not the 2x users assume from visible output alone. This creates a cost cliff at scale. Monitor 'reasoning\_tokens' in usage dashboards \(the field is also returned in the usage object of API responses\) to calculate true cost-per-answer. The sign that you are hitting this problem is cost that grows linearly with task complexity rather than sub-linearly.
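As a rough illustration of the cost-per-answer calculation, the sketch below splits billed completion tokens into visible and hidden parts and prices them. It assumes a usage payload shaped like the `usage` object of a chat completion response, where `completion_tokens` already includes the count reported in `completion_tokens_details.reasoning_tokens`. The helper `completion_cost`, the `CompletionCost` type, and the sample payloads are hypothetical, and the prices are the ones quoted in this report, so check current pricing before relying on them. Input tokens are left out, so the ratio covers the completion bucket only.

```python
from dataclasses import dataclass

# Completion-token prices in USD per 1M tokens, as quoted in this report.
# These are assumptions; verify against current pricing.
O1_PREVIEW_COMPLETION_USD_PER_M = 60.0
GPT4O_COMPLETION_USD_PER_M = 10.0


@dataclass
class CompletionCost:
    visible_tokens: int
    reasoning_tokens: int
    cost_usd: float


def completion_cost(usage: dict, usd_per_million: float) -> CompletionCost:
    """Split billed completion tokens into visible and hidden parts and price them.

    Assumes completion_tokens already includes the hidden reasoning tokens.
    """
    billed = usage["completion_tokens"]
    # Models without hidden reasoning may omit the details entry entirely.
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    return CompletionCost(
        visible_tokens=billed - reasoning,
        reasoning_tokens=reasoning,
        cost_usd=billed * usd_per_million / 1_000_000,
    )


# Hypothetical usage payloads matching the report's 10k + 2k example.
o1_usage = {
    "prompt_tokens": 10_000,
    "completion_tokens": 6_000,
    "completion_tokens_details": {"reasoning_tokens": 4_000},
}
gpt4o_usage = {"prompt_tokens": 10_000, "completion_tokens": 2_000}

o1 = completion_cost(o1_usage, O1_PREVIEW_COMPLETION_USD_PER_M)
gpt4o = completion_cost(gpt4o_usage, GPT4O_COMPLETION_USD_PER_M)

print(f"o1-preview: {o1.visible_tokens} visible + {o1.reasoning_tokens} hidden = ${o1.cost_usd:.2f}")
print(f"GPT-4o:     {gpt4o.visible_tokens} visible = ${gpt4o.cost_usd:.2f}")
print(f"completion-side ratio: {o1.cost_usd / gpt4o.cost_usd:.1f}x")
```

Logging `reasoning_tokens` per request alongside the computed cost makes it possible to see whether hidden tokens, rather than visible output, are driving spend as task complexity grows.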
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:33:58.317615+00:00— report_created — created