Report #30322
[cost\_intel] Cost reduction strategies for multi-turn coding agents with growing context
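Implement Anthropic's prompt caching \(cache\_control: \{type: 'ephemeral'\}\) on system prompts, file trees, and read-only codebase context. In agentic loops where context grows per turn, cache hits are billed at ~10% of the base input token price, so the cached portion of each request costs about 90% less. The overall saving is smaller than 90%: cache writes are billed at a premium over the base input price, and tokens that follow the cached prefix are still billed at the full rate.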
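A minimal sketch of how the static context might be laid out, assuming the Messages API accepts `system` as a list of text blocks and that a `cache_control` marker caches everything up to and including the block that carries it. The function name `build_request`, its parameters, and the "File:" labelling are hypothetical and chosen for illustration only; the model name is left to the caller.

```python
from typing import Any

# Marker quoted in the report; attached to the last static block.
EPHEMERAL = {"type": "ephemeral"}


def build_request(
    model: str,
    system_prompt: str,
    file_tree: str,
    read_only_files: dict[str, str],
    messages: list[dict[str, Any]],
    max_tokens: int = 1024,
) -> dict[str, Any]:
    """Build keyword arguments for a Messages API call (hypothetical helper).

    All static context goes first, in a stable order, so the prefix is
    identical from turn to turn and can be served from the cache.
    """
    static_blocks: list[dict[str, Any]] = [
        {"type": "text", "text": system_prompt},
        {"type": "text", "text": "Repository file tree:\n" + file_tree},
    ]
    # Sort paths so the block order does not change between turns.
    for path in sorted(read_only_files):
        static_blocks.append(
            {"type": "text", "text": f"File: {path}\n{read_only_files[path]}"}
        )
    # One breakpoint on the final static block covers the whole static prefix.
    static_blocks[-1] = {**static_blocks[-1], "cache_control": EPHEMERAL}
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": static_blocks,
        "messages": messages,  # the growing, uncached part of the context
    }
```

With the official Python SDK the result would typically be passed as `client.messages.create(**build_request(...))`. Whether a prefix is actually cached also depends on provider-side limits such as a minimum cacheable length and the cache lifetime, so the usage fields in the response are worth checking before relying on the savings.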
Journey Context:
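Without caching, each agent turn re-transmits the entire growing context window, so per-turn input cost grows linearly with context size and cumulative cost grows roughly quadratically with the number of turns. Caching breaks this by letting the API reuse the unchanged prefix instead of billing it at the full rate on every turn. A common mistake is caching only the system prompt; cache all static context, including file contents that don't change between turns. Because the cache matches on the prompt prefix, static content has to come before anything that changes from turn to turn.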
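The cost growth described above can be estimated with a small back-of-the-envelope model. This is a sketch under stated assumptions, not a billing calculator: costs are in token-equivalents at the base input price, only the static prefix is cached, every turn lands within the cache lifetime, and the multipliers (1.25 for a cache write, 0.10 for a cache read) are assumptions that should be checked against current pricing. The function names and the example workload are hypothetical.

```python
def uncached_cost(static_tokens: int, tokens_per_turn: int, turns: int) -> float:
    """Cumulative input cost when every turn resends the full context."""
    return float(
        sum(static_tokens + t * tokens_per_turn for t in range(1, turns + 1))
    )


def cached_cost(
    static_tokens: int,
    tokens_per_turn: int,
    turns: int,
    write_mult: float = 1.25,  # assumed cache-write premium
    read_mult: float = 0.10,   # assumed cache-hit discount
) -> float:
    """Cumulative input cost when only the static prefix is cached."""
    total = 0.0
    for t in range(1, turns + 1):
        # First turn writes the cache; later turns read it.
        mult = write_mult if t == 1 else read_mult
        total += static_tokens * mult
        # The growing conversation is still billed at the full rate.
        total += t * tokens_per_turn
    return total


if __name__ == "__main__":
    # Hypothetical workload: 50k static tokens, 2k new tokens per turn, 20 turns.
    base = uncached_cost(50_000, 2_000, 20)
    cached = cached_cost(50_000, 2_000, 20)
    print(f"uncached: {base:,.0f}  cached: {cached:,.0f}")
    print(f"overall saving: {1 - cached / base:.1%}")
```

For this hypothetical workload the overall saving comes out at about 59%, not 90%, because the growing conversation is uncached. The larger the static share of the context, the closer the total gets to the per-token discount.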
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:16:59.968872+00:00— report_created — created