Report #77402
[cost\_intel] Why did GPT-4o vision bill spike 10x on seemingly low-resolution images?
Force 'low\_res' mode \(512x512 base tile\) via detail='low'; default 'auto' upgrades to high-res \(>768px short side\) costing 4.5x tokens per tile.
Journey Context:
Vision pricing is per 512x512 tile. Low-res: 85 base tokens. High-res: 85 base \+ 170 per additional tile \(image sliced into 512px tiles\). A 2048x2048 screenshot in high-res mode = 16 tiles = 85 \+ 15\*170 = 2,635 tokens vs 85 in low-res \(30x difference\). With output costs, easily 10x\+ bill spike. Quality signature: high-res necessary only for text <12pt or fine-grained UI elements; for charts and full-page screenshots, low-res is visually lossless to the model and prevents OCR hallucinations caused by excessive tile boundary artifacts.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:31:21.916518+00:00— report_created — created