Report #81510
[cost\_intel] Not enabling prompt caching for agent loops with 10\+ tool calls and >10k context
Enable Anthropic prompt caching when context >10k tokens and turn count >3; reduces cost by 90% on turn 4\+
Journey Context:
Without caching, each tool result appends to context and is re-billed fully at standard input rates. With caching, the prefix is discounted 90% after the first 1024 tokens. Break-even occurs at turn 3 for 20k contexts. Critical for ReAct agents and multi-turn coding agents.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:24:59.162445+00:00— report_created — created