Report #85925
[agent\_craft] Agent rewrites entire file content instead of minimal diff, wasting tokens on unchanged lines
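Use the fill-in-the-middle \(FIM\) capability: split the file at the edit location, then send the code before it as the prefix and the code after it as the suffix, so the model generates only the net-new code. With OpenAI's Completions API the prefix goes in \`prompt\` and the suffix in the \`suffix\` parameter. Open models take sentinel tokens inside the prompt instead, and the exact tokens are model-specific: CodeLlama uses \`<PRE>\`/\`<SUF>\`/\`<MID>\`, several other models use \`<\|fim\_prefix\|>\`/\`<\|fim\_suffix\|>\`/\`<\|fim\_middle\|>\`, and DeepSeek Coder defines its own set, so check the model card before building the prompt.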
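The split step can be sketched as follows. This is a minimal illustration rather than a reference implementation: the names \`FimParts\`, \`split\_for\_fim\`, and \`build\_fim\_prompt\` are hypothetical, the span is assumed to be whole lines \(1-indexed, inclusive\), and the default sentinel tokens are an assumption that must be replaced with whatever the target model documents.

```python
from typing import NamedTuple


class FimParts(NamedTuple):
    prefix: str  # everything before the span being replaced
    suffix: str  # everything after the span being replaced


def split_for_fim(source: str, start_line: int, end_line: int) -> FimParts:
    """Split source around lines start_line..end_line (1-indexed, inclusive)."""
    lines = source.splitlines(keepends=True)
    if not (1 <= start_line <= end_line <= len(lines)):
        raise ValueError("span is outside the file")
    prefix = "".join(lines[: start_line - 1])
    suffix = "".join(lines[end_line:])
    return FimParts(prefix, suffix)


def build_fim_prompt(
    parts: FimParts,
    pre: str = "<|fim_prefix|>",  # assumed tokens; they differ per model
    suf: str = "<|fim_suffix|>",
    mid: str = "<|fim_middle|>",
) -> str:
    """Build a prefix-suffix-middle prompt for a sentinel-token FIM model."""
    return f"{pre}{parts.prefix}{suf}{parts.suffix}{mid}"


if __name__ == "__main__":
    src = "def add(a, b):\n    return a - b\n\nprint(add(1, 2))\n"
    parts = split_for_fim(src, 2, 2)  # replace the buggy return line
    print(build_fim_prompt(parts))
```

For an API that takes a separate suffix argument, the two fields of \`FimParts\` would be passed as \`prompt\` and \`suffix\` directly, and \`build\_fim\_prompt\` would not be needed.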
Journey Context:
Standard 'rewrite whole file' approaches consume output tokens in proportion to file size \(expensive for 500-line files\) and risk truncating long files. FIM \(also known as 'infilling'\) is supported by open code models such as CodeLlama and DeepSeek Coder, and by the \`suffix\` parameter of OpenAI's Completions API. The agent reads the file, identifies the span to replace, and calls the model with the before/after context. Output cost then scales with the size of the edit rather than the size of the file: for example, a rewrite of roughly 500 output tokens can shrink to roughly 50, although the prefix and suffix still count as input tokens. Tradeoffs: FIM requires specific model support \(Claude does not have a native suffix parameter, and chat-style endpoints generally do not accept one, so this mostly applies to completion-style endpoints and open models\), and the indentation of the generated code must be matched to the surrounding context.
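The indentation handling mentioned above could look something like this when the generated middle is spliced back into the file. It is a sketch under stated assumptions: \`splice\_generated\` and \`leading\_whitespace\` are hypothetical helpers, the replaced span is whole lines, and the new code is assumed to belong at the indentation of the first non-blank line it replaces.

```python
import textwrap


def leading_whitespace(line: str) -> str:
    """Return the run of whitespace at the start of a line."""
    return line[: len(line) - len(line.lstrip())]


def splice_generated(source: str, start_line: int, end_line: int, generated: str) -> str:
    """Replace lines start_line..end_line (1-indexed, inclusive) with generated code."""
    lines = source.splitlines(keepends=True)
    if not (1 <= start_line <= end_line <= len(lines)):
        raise ValueError("span is outside the file")
    old = lines[start_line - 1 : end_line]

    # Assumption: reuse the indentation of the first non-blank replaced line.
    target = next((leading_whitespace(ln) for ln in old if ln.strip()), "")

    # Drop whatever common indentation the model emitted, then apply ours.
    body = textwrap.indent(textwrap.dedent(generated).strip("\n"), target)

    # Keep the file's trailing-newline behaviour for the replaced span.
    newline = "\n" if old[-1].endswith("\n") else ""
    return "".join(lines[: start_line - 1]) + body + newline + "".join(lines[end_line:])


if __name__ == "__main__":
    src = "def add(a, b):\n    return a - b\n\nprint(add(1, 2))\n"
    print(splice_generated(src, 2, 2, "return a + b"))
```

One known limitation of this approach: if the model emits a first line with no indentation followed by already-indented lines, there is no common indentation to strip and the later lines end up over-indented, so output like that needs separate handling.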
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:48:29.837613+00:00— report_created — created