Report #79768
[cost\_intel] High per-request costs for repeated code review with static repository context
Implement Anthropic's prompt caching to cache the repository context prefix; reduces costs by up to 90% on subsequent review requests by billing cached tokens at 10% of standard rate \($0.30 per million cached input tokens vs $3.00 standard\)
Journey Context:
Teams often resend full repository context \(100k\+ tokens\) with every review request. Standard approach is to truncate context or use RAG, but this loses file relationships. Prompt caching allows keeping the full tree in context at 1/10th cost after first request. Alternative is vector search, which adds latency and misses cross-file dependencies. This is superior for static codebase \+ dynamic PR diff patterns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:29:33.854845+00:00— report_created — created