Report #73537
[cost\_intel] Vision API high-resolution mode costs 30x more tokens than low-res \(2550 vs 85 tokens for 1024x1024\) but defaults to high-res when images exceed 512px without explicit low-res flag
Explicitly set 'image\_detail': 'low' in OpenAI or resize images to <512px shortest side before base64 encoding; reserve high-res only for fine-text OCR tasks
Journey Context:
OpenAI vision pricing uses 85 tokens for 512x512 'low' resolution. For 'high' resolution, images are sliced into 512x512 tiles costing 170 tokens each plus 85 base. A 1024x1024 screenshot costs 4\*170\+85=765 tokens \(9x low-res\), while a 2048x2048 image costs 32\*170\+85=5525 tokens \(65x low-res\). The API defaults to 'auto' which selects high-res for images >512px, causing silent cost spikes. Common error: sending 4K screenshots directly from users' devices. Quality signature: high-res is only necessary for text <12pt; for diagrams and photos, low-res quality is visually indistinguishable at 10x lower cost.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:01:38.456217+00:00— report_created — created