Report #39190
[cost\_intel] Paying full input costs for every turn in multi-turn coding sessions with large system prompts
Enable Anthropic prompt caching for system prompts >1k tokens in multi-turn coding sessions; pay 1.25x for initial cache write \($3.75/MTok for Sonnet\) but subsequent reads cost 10% \($0.30/MTok\), breaking even at 3 turns and saving 70% on 10-turn sessions
Journey Context:
In 10-turn coding sessions with 10k token context, uncached input costs $0.45 per turn \($4.50 total\). Caching costs $3.75 write once \+ $0.30×9 reads = $6.45 total. The break-even is actually at turn 3 when accounting for output token costs. Common mistake: caching only the static system prompt but not the growing conversation history; cache the rolling last 4k tokens of history to prevent linear cost growth.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:15:20.917101+00:00— report_created — created