Report #59054
[cost_intel] Token bloat from including full file contents when only a fragment is needed
Always extract and send only the relevant code section (function, class, or diff hunk with its context) rather than entire files. Use AST-aware extraction to pull just the target symbol and its direct type dependencies. This routinely cuts input token count by 10-50x per request with no loss in output quality.
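As a rough illustration of the extraction step, here is a minimal sketch using Python's standard `ast` module (Python 3.8+). The helper name `extract_symbol` and the `SAMPLE` source are hypothetical and not part of any tool named in this report. The sketch matches dependencies by name only, looks only at top-level definitions, and does not follow transitive dependencies, so treat it as a starting point rather than a complete implementation.

```python
import ast


def extract_symbol(source: str, name: str) -> str:
    """Return the source of top-level symbol `name`, plus the imports and
    top-level classes it references by name (hypothetical helper)."""
    tree = ast.parse(source)

    # Index top-level functions and classes by name.
    defs = {
        node.name: node
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }
    if name not in defs:
        raise KeyError(f"symbol {name!r} not found at module top level")
    target = defs[name]

    # Every bare name used inside the target, including type annotations.
    used = {n.id for n in ast.walk(target) if isinstance(n, ast.Name)}

    parts = []
    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            # Keep an import only if one of the names it binds is used.
            bound = {(a.asname or a.name).split(".")[0] for a in node.names}
            if bound & used:
                parts.append(ast.get_source_segment(source, node))
        elif (
            isinstance(node, ast.ClassDef)
            and node is not target
            and node.name in used
        ):
            # Direct (non-transitive) class dependency referenced by name.
            parts.append(ast.get_source_segment(source, node))

    # Note: get_source_segment starts at the def/class keyword, so any
    # decorators on the node are not captured by this sketch.
    parts.append(ast.get_source_segment(source, target))
    return "\n\n".join(parts)


SAMPLE = '''\
import math
import os

class Shape:
    def __init__(self, radius: float):
        self.radius = radius

class Unrelated:
    pass

def area(shape: Shape) -> float:
    return math.pi * shape.radius ** 2

def unused_helper():
    return os.getcwd()
'''

snippet = extract_symbol(SAMPLE, "area")
print(snippet)
```

For the sample above, the snippet keeps `import math`, the `Shape` class, and `area`, and drops `import os`, `Unrelated`, and `unused_helper`. A production version would likely also need to handle decorators, module-level constants, and nested or re-exported symbols.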
Journey Context:
The most common silent cost multiplier in AI coding tools is full-file inclusion. A typical source file is 200-500 lines; the relevant context for a specific task is often 10-30 lines. At frontier model pricing ($3-15/M input tokens), including 10 unnecessary files at 300 lines each adds roughly 15K tokens (about 5 tokens per line), or $0.05-0.23 per request. At 10K requests/day, that is roughly $450-2,250/day in wasted inference. The pattern is especially insidious because it causes no errors: it silently inflates costs with no quality gain. Tools that auto-include all open files or full repository maps are the worst offenders. The fix is AST-aware context extraction: parse the file, identify the target symbol, and include only that symbol plus its direct type dependencies and imports.
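The cost estimate in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and `tokens_per_line=5` is an assumed value chosen so that 10 files of 300 lines come to the 15K tokens quoted above. Real token density varies by language and tokenizer.

```python
def wasted_tokens(files: int, lines_per_file: int, tokens_per_line: int) -> int:
    """Tokens spent on context the task did not need."""
    return files * lines_per_file * tokens_per_line


def usd(tokens: int, usd_per_million: float, requests: int = 1) -> float:
    """Input cost in USD for `requests` requests of `tokens` tokens each."""
    return tokens * requests * usd_per_million / 1_000_000


# Assumption: about 5 tokens per line, backed out from the 15K-token estimate.
tokens = wasted_tokens(files=10, lines_per_file=300, tokens_per_line=5)

for price in (3, 15):  # USD per million input tokens, the range quoted above
    per_request = usd(tokens, price)
    per_day = usd(tokens, price, requests=10_000)
    print(f"${price}/M: {tokens} tokens, ${per_request:.3f}/request, ${per_day:,.0f}/day")
```

With these inputs the figures come out to about $0.045-0.225 per request and $450-2,250 per day at 10K requests/day, in line with the rounded estimates above. Swapping in a measured tokens-per-line value for your own codebase gives a more realistic number.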
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:36:31.020062+00:00— report_created — created