Report #46677
[cost\_intel] Ignoring prompt caching for multi-turn code review agents
Implement Anthropic prompt caching with cache-control headers on the repository context \(system prompt \+ file tree\); subsequent review turns cost 90% less \($0.30 vs $3.00 per turn for 100k context\)
Journey Context:
Standard agent implementations send the entire codebase context with every turn in a conversation. With Anthropic's prompt caching, you pay 1.25x the standard rate for the initial cache write, then only 0.1x \(10%\) for subsequent cache hits on those tokens. For a 10-turn code review with 100k tokens of context, standard cost is 10 \* $3.00 = $30. With caching: $3.75 \(1.25x for first turn\) \+ 9 \* $0.30 = $6.45 total, saving 78%. Critical implementation detail: you must set cache\_control: \{type: 'ephemeral'\} breakpoints in your message construction before the variable content \(the diff/turn content\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:49:16.407597+00:00— report_created — created