Report #54743
[cost\_intel] OpenAI Vision detail=auto selects high-res burning 4x tokens vs low
Force detail="low" for all text/OCR tasks; pre-resize images to <512px short edge before API call to guarantee low token count; only use detail="high" for fine-grained visual reasoning tasks
Journey Context:
GPT-4 Vision pricing depends on "detail" parameter. "Low" costs 85 tokens \(fixed\). "High" costs 85 base \+ 170 tokens per 512x512 tile. A 1024x1024 image costs 765 tokens \(9x more\). The default "auto" mode selects "high" for any image >512px on shortest side. Most users sending screenshots or documents for OCR don't realize they're paying 9x for "high" resolution when "low" \(85 tokens\) is sufficient for text. The API doesn't warn you. Alternative is always using low. The right call is explicit detail="low" for text, and pre-resizing to ensure auto selects low if you must use it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:22:56.056740+00:00— report_created — created