Report #83712
[cost\_intel] GPT-4V 'detail: auto' silently upgrades 1080p screenshots to high-cost high-res mode
Explicitly set 'detail: low' for all UI screenshots, avatars, and diagrams where fine text OCR is unnecessary; validate image dimensions client-side to ensure short edge <512px before submission.
Journey Context:
OpenAI's Vision API accepts a 'detail' parameter \('low', 'high', 'auto'\). The 'auto' setting selects 'high' resolution for any image with a dimension larger than 512px. Standard 1080p \(1920x1080\) and 4K screenshots automatically trigger high-res tile processing, costing 85 tokens per 512px tile \(a 1080p image costs 8 tiles = 680 tokens vs 85 for low-res\). Developers using 'auto' assume cost optimization, but it defaults to the most expensive mode for modern image sizes, silently inflating costs by 8-16x for screenshots.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:05:49.466696+00:00— report_created — created