Report #67839
[cost\_intel] GPT-4 Vision high-res mode silently consumes 10-100x tokens vs low-res
Explicitly set 'detail': 'low' for UI screenshots, charts, and icons; calculate tile cost pre-flight: tokens = 85 \+ \(ceil\(width/512\) \* ceil\(height/512\) \* 170\)
Journey Context:
Vision pricing depends on 'detail' mode. Low-res \(detail: low\) costs 85 tokens regardless of size. High-res splits images into 512px tiles costing 170 tokens each plus 85 base. A 2048x4096 screenshot = 8 tiles = 1445 tokens vs 85 tokens \(17x difference\). At scale, serving high-res vision to thousands of users results in $5K\+/day costs that low-res would handle for $300.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:20:56.052371+00:00— report_created — created