Report #99504
[cost\_intel] Vision models bill by fixed image tiles, so cropping or downscaling often does not reduce cost
Use low-resolution mode for classification tasks and only high detail when fine detail is required; pre-crop to the relevant region instead of sending a large image at low detail.
Journey Context:
OpenAI resizes images to 512px tiles and bills per tile. A 1024x1024 image costs 4 tiles whether you send the original or a slightly smaller version. The only levers are the detail setting and the actual dimensions crossing tile boundaries. For OCR or small-object detection you need high detail; for is this a cat you can use low detail and cut token count by ~10x.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:15:13.037421+00:00— report_created — created