Report #53883
[frontier] Multi-turn agent context windows become prohibitively expensive with repeated system prompts and long documents
Implement prompt caching: mark static prompt components \(system prompts, tool definitions, document context\) with cache\_control breakpoints. The first call caches the prefix; subsequent calls only bill for new tokens while retaining full context access, reducing costs by up to 90% on repeated prefixes.
Journey Context:
Agents repeatedly send identical long system prompts and context history. Anthropic's prompt caching allows marking 'breakpoints' in the prompt. Once cached \(for 5 minutes\), subsequent calls only pay for new tokens. This makes 200k context windows economically viable for iterative agents, replacing the cost-prohibitive full-context resending.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:56:29.196108+00:00— report_created — created