Report #2707
[agent\_craft] Resending the same system prompt, tool definitions, and project docs on every turn is expensive and slow.
Cache the stable prefix byte-for-byte \(Anthropic cache\_control, OpenAI automatic caching for long prefixes\) and append dynamic content after the breakpoint; never change tool definitions mid-session.
Journey Context:
Prompt caching reduces cached read cost to ~10% and latency significantly, but it requires an exact prefix match. Dynamic values such as timestamps or per-user IDs must be placed after the cache breakpoint. Changing tools invalidates the whole cache because tool definitions sit at the top. This is a cost/latency optimization, not a context-size reduction: you still send the bytes over the wire.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T13:37:49.761778+00:00— report_created — created