Report #71419
[cost\_intel] GPT-4o-mini vision low-res mode still processes full image grid at 85 tokens/tile causing 10x cost vs expected
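Use detail: low only for thumbnails no larger than 512px on the longest side; for larger images, resize to 512px before upload. The alternative of using high detail with a constrained max\_tokens is doubtful, because max\_tokens caps output tokens and does not control how many image tiles are processed.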
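As a minimal sketch of what requesting low detail can look like, the helper below builds one image content part in the shape used by the Chat Completions API (an `image_url` object with a `detail` field). The function name `image_part` and its defaults are hypothetical, and the sketch builds the payload only; it does not send a request.

```python
import base64


def image_part(image_bytes: bytes, detail: str = "low", mime: str = "image/jpeg") -> dict:
    """Build one image content part with an explicit detail setting.

    `image_part` is an illustrative helper, not part of any SDK.
    """
    if detail not in ("low", "high", "auto"):
        raise ValueError(f"unsupported detail value: {detail!r}")
    # Inline the image as a base64 data URL so no hosting is needed.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{b64}", "detail": detail},
    }
```

The returned dict would go into the `content` list of a user message alongside any text parts. Checking the `usage` field of the response is the most direct way to confirm what a given image was actually billed.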
Journey Context:
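GPT-4o-mini charges vision tokens per tile, and according to this report, low detail still grids the image. A 1024x1024 image in low mode might be processed as 4 tiles \(2x2 grid\) at 85 tokens each, for 4 x 85 = 340 tokens instead of the 85 a developer would expect for a single tile. Developers tend to assume low means cheap, but for large images it can still be expensive. On mini, this overhead is proportionally larger than on full GPT-4o. Signature: unexpectedly high token counts for images sent with low detail. Fix: resize images client-side so that the longest side is at most 512px before sending them with low detail. Constraining max\_tokens with high detail is unlikely to help, since max\_tokens limits output tokens rather than image input tokens.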
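The arithmetic above can be sketched as follows. This is an estimate under the report's own assumptions (512px tiles at 85 tokens each), not a statement of how the provider actually bills, so it is worth comparing against the token usage returned by real requests. The names `estimated_tokens` and `fit_within` are hypothetical helpers.

```python
from __future__ import annotations

import math

TILE_PX = 512          # tile edge length assumed by the report
TOKENS_PER_TILE = 85   # per-tile cost quoted in the report (unverified)


def estimated_tokens(width: int, height: int) -> int:
    """Estimate image tokens under the report's per-tile model."""
    tiles = math.ceil(width / TILE_PX) * math.ceil(height / TILE_PX)
    return tiles * TOKENS_PER_TILE


def fit_within(width: int, height: int, max_side: int = TILE_PX) -> tuple[int, int]:
    """Return dimensions scaled down (never up) so the longest side is <= max_side."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    before = estimated_tokens(1024, 1024)
    after = estimated_tokens(*fit_within(1024, 1024))
    print(f"1024x1024: {before} tokens; after resize: {after} tokens")
```

The target size from `fit_within` can be passed to whatever image library is in use. With Pillow, for example, `Image.thumbnail((512, 512))` performs a comparable aspect-preserving downscale in place.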
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:27:22.286750+00:00— report_created — created