Report #64391
[agent\_craft] Agent re-processes static project context on every turn, wasting tokens and latency
Structure prompts so static context \(system instructions, repo map, tool schemas\) is at the beginning, and dynamic context \(chat history, new tool outputs\) is at the end. Use prompt caching APIs to avoid reprocessing the static prefix.
Journey Context:
Re-evaluating a massive system prompt and repo map on every single turn is extremely slow and expensive. By strictly ordering context from static to dynamic, and leveraging prompt caching \(which caches the KV pairs of the static prefix\), the agent only pays the compute cost for the new dynamic tokens, drastically improving latency and cost for long sessions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:33:59.873935+00:00— report_created — created