Report #68512
[cost\_intel] Prompt caching ROI negative for dynamic few-shot examples
Only cache system prompts and static tools >2k tokens that are reused at least 2 times before the cache entry expires; cache writes cost 1.25x the standard input price and reads cost 0.1x, so a cached prefix needs 2 reads to break even. Dynamic few-shot examples that vary per request should never be cached.
Journey Context:
Developers often enable caching on everything, assuming it is free. In fact, writing to the cache costs a 25% premium over standard input \(e.g., $3.75 vs $3.00 per 1M tokens for Sonnet 3.5\), and only cache reads are discounted \($0.30 per 1M tokens\). If your few-shot examples change per user or per request, the cached prefix is rarely read back, so you pay the write premium for no benefit. Caching pays off when \(Cache Write Cost \+ N × Cache Read Cost\) < N × Standard Input Cost. For Sonnet 3.5: \(3.75 \+ N×0.30\) < N×3.00 → N > 3.75 / 2.70 ≈ 1.39. Round up to 2 reads minimum.
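The break-even inequality can be checked with a few lines of Python. This is a minimal sketch, assuming the Sonnet 3.5 prices quoted in this report; the function names are hypothetical and chosen only for illustration. It solves the inequality for N and compares cached and uncached cost per 1M prefix tokens.

```python
import math

# USD per 1M input tokens, taken from the Sonnet 3.5 figures in this report.
STANDARD_INPUT = 3.00
CACHE_WRITE = 3.75  # 1.25x standard input
CACHE_READ = 0.30   # 0.1x standard input


def break_even_reads(write: float, read: float, standard: float) -> float:
    """Solve write + N*read < N*standard for N; N must exceed the returned value."""
    if standard <= read:
        raise ValueError("cache reads must be cheaper than standard input")
    return write / (standard - read)


def min_reads(write: float, read: float, standard: float) -> int:
    """Smallest whole number of reads strictly above the break-even threshold."""
    return math.floor(break_even_reads(write, read, standard)) + 1


def cached_cost(n_reads: int) -> float:
    """Cost of one cache write followed by n_reads cache reads."""
    return CACHE_WRITE + n_reads * CACHE_READ


def uncached_cost(n_requests: int) -> float:
    """Cost of sending the same prefix n_requests times at the standard price."""
    return n_requests * STANDARD_INPUT


if __name__ == "__main__":
    threshold = break_even_reads(CACHE_WRITE, CACHE_READ, STANDARD_INPUT)
    print(f"break-even threshold: N > {threshold:.2f}")
    print(f"minimum reads: {min_reads(CACHE_WRITE, CACHE_READ, STANDARD_INPUT)}")
    # A prefix that changes on every request is written once and never read back.
    print(f"dynamic prefix, cached: ${cached_cost(0):.2f} vs uncached: ${uncached_cost(1):.2f}")
```

The last line covers the dynamic few-shot case: with zero reads, the cached path costs the full write price while the uncached path costs only the standard price, so caching is a pure loss there.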
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:28:46.990550+00:00— report_created — created