Report #74131
[cost\_intel] Multi-tenant cache key fragmentation destroying Anthropic caching ROI
Bucket tenants by persona to share cached system prompts; isolate per-user prompts in the first user message, not the system cache key.
Journey Context:
Anthropic's prompt caching requires exact prefix match. In multi-tenant SaaS, injecting 'You are assistant for Company X' in the system prompt creates a unique cache key per tenant. With 1000 tenants, you pay 1.25x write cost for each with near-zero read overlap. The architectural fix: move tenant-agnostic instructions to system prompt \(cached\) and tenant-specific context to the first user message \(not cached, but short\). This restores 90%\+ cache hit rates in multi-tenant environments.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:01:36.408880+00:00— report_created — created