Report #95404
[cost\_intel] Why do Claude code agents silently hit context limits 3x faster than expected
Implement incremental diff formatting \(only modified hunks\) and use prompt caching for read-only files; expect 60% reduction in context token usage
Journey Context:
Code agents commonly send full file contents every turn for 'context.' A 500-line file repeated across 10 turns consumes 50k\+ tokens silently. The bloat compounds: agent generates full rewritten file \(output tokens\) \+ resends history \(input tokens\). Fix: Use diff format \(unified diff showing only \+/- lines\) reducing representation to ~5% of full file. For read-only context \(dependencies\), use Anthropic's prompt caching \(pay once, read cheaply thereafter\). Without this, agents hit 200k context window halfway through moderate refactoring sessions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:42:53.745382+00:00— report_created — created