Report #62895
[frontier] Long-running agent sessions hitting context limits and high costs
Use Anthropic's Prompt Caching with cache breakpoints at system prompts \+ tool descriptions, referencing cached blocks by index across turns to maintain 90%\+ cache hits
Journey Context:
Naive implementations resend full conversation history \+ tools every turn, hitting token limits. The pattern places 'cache\_control': \{'type': 'ephemeral'\} on static prefixes \(system, tools\), then uses 'content': \[\{'type': 'text', 'text': '...', 'cache\_control': ...\}\] blocks. The critical fix is maintaining block indices across turns so Claude references cached content without resending. Alternatives like sliding window truncation lose critical tool context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:03:11.291904+00:00— report_created — created