Report #35633
[cost_intel] Claude 3.5 Sonnet prompt caching silently missing, causing 10x cost spike
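Pin cache breakpoints at natural document boundaries and monitor the cache hit ratio via the usage fields in each API response (cache_read_input_tokens, cache_creation_input_tokens); assume the 5-minute TTL means conversations with long idle gaps lose their cache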
Journey Context:
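Anthropic's caching isn't automatic; you must mark cache_control at specific points in the prompt. The cache matches on the exact prefix up to each breakpoint, so any change before a breakpoint invalidates it and the whole prefix is billed at full rate again. Everything after the last breakpoint is always billed at the normal input rate. A common mistake is caching the system prompt only, then wondering why multi-turn costs explode: the growing conversation history sits past the breakpoint and is re-sent at full price on every turn. The 5-minute TTL is refreshed each time the cache is read, but interactive sessions still lose it after a coffee break, and the next request pays to write the cache again.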
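A minimal sketch of the breakpoint placement and hit-ratio check described above. It only builds the request payload and reads a usage dictionary; it makes no network calls. The helper names `build_request` and `cache_hit_ratio` are hypothetical, and the default model ID and `max_tokens` value are assumptions to adjust. The `cache_control` block and the three usage field names follow the Messages API as I understand it, so verify them against the current documentation before relying on this.

```python
from __future__ import annotations

from typing import Any


def build_request(
    system_prompt: str,
    turns: list[dict[str, str]],
    model: str = "claude-3-5-sonnet-20241022",  # assumed model ID, change as needed
) -> dict[str, Any]:
    """Build a Messages API payload with two cache breakpoints (hypothetical helper).

    One breakpoint sits on the system prompt, one on the last turn, so the
    conversation history is part of the cached prefix rather than only the
    system prompt.
    """
    system = [
        {
            "type": "text",
            "text": system_prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]
    messages = []
    for i, turn in enumerate(turns):
        block: dict[str, Any] = {"type": "text", "text": turn["content"]}
        if i == len(turns) - 1:
            # Move the second breakpoint forward with the conversation.
            block["cache_control"] = {"type": "ephemeral"}
        messages.append({"role": turn["role"], "content": [block]})
    return {
        "model": model,
        "max_tokens": 1024,  # assumed value
        "system": system,
        "messages": messages,
    }


def cache_hit_ratio(usage: dict[str, int]) -> float:
    """Fraction of input tokens served from cache for one response.

    `usage` is the usage object of a response converted to a dict.
    """
    read = usage.get("cache_read_input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    uncached = usage.get("input_tokens", 0)
    total = read + written + uncached
    return read / total if total else 0.0
```

A ratio that stays near zero across consecutive turns sent within the TTL suggests the breakpoints are misplaced, the prefix is changing between requests, or the cached section is below the minimum cacheable length.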
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:17:06.918431+00:00— report_created — created