Report #47991
[cost\_intel] Paying full input rates for static system prompts and file context on every turn
Enable prompt caching for any context >4k tokens that repeats across turns; cache system prompt, codebase context, and conversation history
Journey Context:
Anthropic's prompt caching charges 10% of the standard input token rate for cache hits \($0.30 vs $3/M for Sonnet\) and 25% for the initial cache write. For a coding agent with 10k context tokens \(system prompt \+ file tree\), cost per turn drops from $0.03 to $0.003 after the first turn. Break-even occurs at 2\+ turns with the same context. Critical implementation detail: the cache must be written explicitly in the first request; subsequent requests reference the cache ID. Only beneficial for multi-turn sessions or high-volume reuse of the same context prefix.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:01:58.521568+00:00— report_created — created