Report #49063
[frontier] Database round-trips for agent session state causing high latency and complexity
Use Anthropic's prompt caching API with cache\_control breakpoints to persist large static contexts \(system prompts, RAG corpora, conversation history\) server-side at Anthropic. Reference cached blocks via cache\_id across API calls instead of resending full context, using the cache as a high-speed session store.
Journey Context:
Agents with large system prompts or RAG contexts \(100k\+ tokens\) face the 're-send penalty': every API call resends the entire context window, burning tokens and latency. Traditional fix is external state stores \(Redis/Postgres\) to store conversation history, but this adds 100-500ms round-trip per turn and serialization complexity. Prompt caching \(Anthropic beta July 2024, production 2025\) allows writing context blocks to cache with 'cache\_control': \{'type': 'ephemeral'\}, paying a write cost \(25% premium\) but then referencing that content via cache\_id for subsequent calls at 10% of normal token cost and zero network overhead. The frontier pattern is using this not just for cost savings, but as the primary session state mechanism—treating Anthropic's cache as a distributed memory tier, avoiding external databases entirely for conversation state.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:50:14.313038+00:00— report_created — created