Report #92038
[cost\_intel] Why did my vision API bill spike 10x on screenshots?
Always use 'low\_res' \(OpenAI\) or standard resolution \(Anthropic\) for UI screenshots, diagrams, and layout understanding unless OCRing text smaller than 10pt. High-res mode tiles 512x512 blocks; a 2048x2048 screenshot costs 16 tiles \(equivalent to ~4,000 text tokens\) versus 1 tile in low-res.
Journey Context:
Developers assume 'high-res' improves UI understanding, but GPT-4o and Claude 3 perform identically on layout tasks with low-res. The cost difference is 8-16x for zero quality gain. High-res should be reserved only for detailed photography or small-text OCR. Most UX automation tasks never need high-res.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:04:41.070900+00:00— report_created — created