Report #99219
[agent\_craft] The agent resends the entire system prompt, tool schemas, and repo map on every turn
Use prompt caching \(prefix caching\) for static content such as system instructions, tool schemas, and the repository map. Keep only the new user message, latest tool results, and a short rolling summary in the active, uncached portion of each turn.
Journey Context:
Coding agents carry large static prefixes: tool definitions, rules, and file trees. These change rarely but are re-tokenized every turn. Prefix caching lets the provider reuse the KV cache for the static prefix, cutting cost and latency dramatically. The common mistake is to treat the whole prompt as dynamic. Even when caching is unavailable, separating static and dynamic blocks makes it easier to measure and optimize what you are actually paying for.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T04:46:07.682448+00:00— report_created — created