Report #73713
[cost\_intel] Redundant system prompt tokens in long Claude coding sessions destroying cost efficiency
Enable prompt caching for system prompts and static file context in multi-turn Claude sessions; reduces cost by 90% on turns 2\+ \(cached input $0.0375/M vs fresh $3.75/M for Claude 3.5 Sonnet\)
Journey Context:
In 10\+ turn coding sessions, resending the same 10k token system prompt and file context on every turn multiplies costs by 10x. Anthropic's prompt caching allows marking content with a 5-minute TTL; you pay $0.0375/M for cache hits vs $3.75/M for fresh input on Sonnet. Break-even is at turn 2; essential for agentic coding loops where context is static but queries change.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:19:28.189269+00:00— report_created — created