Report #81384
[cost_intel] Ignoring output token multipliers when generating long code files
Explicitly instruct the model to output only the diff or modified functions, not the entire file. Output tokens cost 3-5x more than input tokens across most providers.
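One way to apply this instruction is to append it to every edit request. The sketch below is only illustrative: `build_edit_prompt`, `DIFF_ONLY_INSTRUCTION`, and the `<source>` delimiters are hypothetical names chosen for this example, not part of any provider's API, and the exact wording of the instruction is an assumption that may need tuning per model.

```python
# Hypothetical helper: builds a prompt that asks for a diff instead of a full rewrite.
# The instruction wording is an assumption, not a provider-recommended phrase.
DIFF_ONLY_INSTRUCTION = (
    "Return only a unified diff of the lines you change. "
    "Do not reprint the full file or any unmodified functions."
)


def build_edit_prompt(path: str, source: str, request: str) -> str:
    """Combine the task, the file contents, and the diff-only instruction."""
    return "\n\n".join(
        [
            f"Task: {request}",
            f"File: {path}",
            f"<source>\n{source}\n</source>",
            DIFF_ONLY_INSTRUCTION,  # placed last so it is the final thing the model reads
        ]
    )


if __name__ == "__main__":
    prompt = build_edit_prompt(
        path="app/billing.py",
        source="def total(items):\n    return sum(items)\n",
        request="Round the total to 2 decimal places.",
    )
    print(prompt)
```

The returned diff still has to be applied to the file (for example with `patch` or `git apply`), and it is worth checking that it applies cleanly before trusting it.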
Journey Context:
A common pattern is asking an LLM to rewrite a 1000-line file to apply a fix, so the model outputs all 1000 lines. With Sonnet priced at $3/M input tokens and $15/M output tokens, generating 30k output tokens (approx. 1000 lines) costs $0.45. If you ask for just the diff (maybe 100 lines, roughly 3k tokens), the output cost drops to $0.045. That is a 10x saving on output cost for the same functional result; the input cost of sending the file is unchanged.
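The arithmetic above can be reproduced with a short calculation. This is a minimal sketch using the prices quoted in this report; `TOKENS_PER_LINE = 30` is an assumption derived from the report's "30k tokens for approx. 1000 lines" estimate, and real token counts vary with language and formatting. Current provider pricing should be checked before relying on these numbers.

```python
# Prices as quoted in the report (USD per million tokens); verify against current pricing.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00

# Assumption taken from the report's estimate: ~30k tokens for ~1000 lines of code.
TOKENS_PER_LINE = 30


def output_cost(lines: int) -> float:
    """Estimated cost in USD of generating `lines` lines of output."""
    tokens = lines * TOKENS_PER_LINE
    return tokens * OUTPUT_PRICE_PER_M / 1_000_000


def input_cost(lines: int) -> float:
    """Estimated cost in USD of sending `lines` lines as input."""
    tokens = lines * TOKENS_PER_LINE
    return tokens * INPUT_PRICE_PER_M / 1_000_000


if __name__ == "__main__":
    full_file = output_cost(1000)  # model rewrites the whole file
    diff_only = output_cost(100)   # model returns only the changed lines
    print(f"full-file output: ${full_file:.3f}")
    print(f"diff-only output: ${diff_only:.3f}")
    print(f"output cost ratio: {full_file / diff_only:.0f}x")
    # The file is sent as input in both cases, so this part does not change.
    print(f"input cost (either way): ${input_cost(1000):.3f}")
```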
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:12:06.373280+00:00— report_created — created