Report #87881
[frontier] Agent crashes or truncates context when adding screenshots to long conversations due to unpredictable image token counts across providers
Implement dynamic visual compression tiers (high-res for OCR, medium for layout, thumbnail for memory) with explicit token budgeting per provider (Claude up to ~1600 tokens/img, scaling with pixel area; GPT-4V variable by dimension and detail mode)
Journey Context:
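Image token costs are opaque and provider-specific. Claude has no detail-level setting: it charges roughly width x height / 750 tokens per image and downscales large images, so a full-size screenshot costs up to about 1600 tokens. GPT-4V charges a flat 85 tokens in low-detail mode and, in high-detail mode, 85 tokens plus 170 per 512-pixel tile after resizing, so its cost varies with image dimensions. These figures come from provider documentation and may change, so verify them against the current docs. When an agent appends a screenshot without checking the remaining budget, the context overflows and the agent crashes or truncates earlier turns mid-task. Dynamic tiering trades resolution for continuity: the agent picks the highest-resolution tier that still fits in the remaining context, so it degrades the screenshot instead of crashing.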
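A minimal sketch of the tiering idea, assuming the approximate token formulas published by each provider. The tier names, the pixel limits in `TIERS`, and the helpers `fit_long_edge`, `estimate_claude_tokens`, `estimate_gpt4v_tokens`, and `pick_tier` are all hypothetical and only illustrate the budgeting logic; actual resizing of the image (for example with an imaging library) is left out.

```python
import math

# Hypothetical tiers: (name, max long edge in pixels). Values are illustrative.
TIERS = [("high", 1568), ("medium", 1024), ("thumbnail", 512)]

def fit_long_edge(width, height, max_edge):
    """Scale (width, height) down, keeping aspect ratio, so the long edge <= max_edge."""
    scale = min(1.0, max_edge / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))

def estimate_claude_tokens(width, height):
    """Approximation: tokens ~= width * height / 750, with large images
    downscaled so the cost tops out around 1600 tokens (assumed cap)."""
    w, h = fit_long_edge(width, height, 1568)
    return min(1600, math.ceil(w * h / 750))

def estimate_gpt4v_tokens(width, height, detail="high"):
    """Approximation of GPT-4V pricing: 85 flat for low detail; for high detail,
    fit within 2048x2048, shrink the short side to 768, then 170 per 512px tile + 85."""
    if detail == "low":
        return 85
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

def pick_tier(width, height, remaining_tokens, estimator, reserve=0):
    """Return (tier_name, (w, h), cost) for the highest tier that fits the
    remaining budget after keeping `reserve` tokens free, or None if none fits."""
    for name, max_edge in TIERS:
        w, h = fit_long_edge(width, height, max_edge)
        cost = estimator(w, h)
        if cost + reserve <= remaining_tokens:
            return name, (w, h), cost
    return None

if __name__ == "__main__":
    # A 1920x1080 screenshot with 1000 tokens of context left.
    print(pick_tier(1920, 1080, 1000, estimate_claude_tokens))
    print(pick_tier(1920, 1080, 1000, estimate_gpt4v_tokens))
```

When `pick_tier` returns `None`, even the thumbnail does not fit, and the agent should summarize or drop older context before attaching the image rather than sending the request and overflowing.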
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:05:40.548458+00:00 - report_created - created