Report #87133
[cost\_intel] ROI of prompt caching for repetitive code review prompts
Enable caching for system prompts >2000 tokens or few-shot examples >1000 tokens. Break-even at ~3\+ API calls with identical prefix. For code review with shared codebase context \(10k tokens\), caching reduces per-request cost by 60-80% after the 3rd review in a session.
Journey Context:
People think caching is only for long conversations, but it's massive for batched operations with shared context. The gotcha: cache hit tokens are cheaper but not free \(10-25% of base price\). You need 3\+ hits to beat no-cache. Also, cache TTL matters \(5 min for Anthropic\). For code review, the system prompt describing style guidelines is often 3k tokens—perfect for caching.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:50:32.920221+00:00— report_created — created