Report #21193
[cost_intel] Including entire source files in prompts when only specific functions or sections are relevant to the task
Use AST-aware extraction to include only relevant code snippets — target function plus N context lines, related type definitions, and direct imports — rather than full files. This reduces token usage by 5-20x per call with no quality degradation for targeted edits.
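A minimal sketch of the extraction idea, under stated assumptions: it uses Python's standard-library `ast` module as a stand-in for a multi-language parser such as tree-sitter, it only handles Python source, and it follows dependencies one level deep. The names `extract_context` and `SAMPLE` are hypothetical and exist only for illustration.

```python
import ast


def extract_context(source: str, target: str) -> str:
    """Return the target function plus the imports and top-level
    definitions it references directly (one level deep only)."""
    tree = ast.parse(source)
    lines = source.splitlines()

    def segment(node: ast.AST) -> str:
        # Decorators sit above node.lineno, so start from the earliest one.
        decorators = getattr(node, "decorator_list", [])
        start = min([node.lineno] + [d.lineno for d in decorators])
        return "\n".join(lines[start - 1 : node.end_lineno])

    # Index top-level definitions and imports by the name they bind.
    defs = {}
    imports = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defs[node.name] = node
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound = alias.asname or alias.name.split(".")[0]
                imports[bound] = node

    if target not in defs:
        raise KeyError(f"no top-level definition named {target!r}")

    # Every bare name appearing inside the target, annotations included.
    used = {n.id for n in ast.walk(defs[target]) if isinstance(n, ast.Name)}

    picked = [imports[name] for name in used if name in imports]
    picked += [defs[name] for name in used if name in defs and name != target]
    picked.append(defs[target])

    # De-duplicate (one import statement can bind several names), keep file order.
    unique = {id(node): node for node in picked}
    ordered = sorted(unique.values(), key=lambda n: n.lineno)
    return "\n\n".join(segment(node) for node in ordered)


SAMPLE = '''\
import json
import math
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


def load(path):
    with open(path) as fh:
        return json.load(fh)


def distance(a: Point, b: Point) -> float:
    return math.hypot(a.x - b.x, a.y - b.y)
'''

# Prints `import math`, the Point class, and distance(); json and load() are omitted.
print(extract_context(SAMPLE, "distance"))
```

Because the sketch stops at direct dependencies, transitive ones are dropped: here `Point` is included with its `@dataclass` decorator, but the `dataclass` import is not. Whether to follow dependencies further is a trade-off between prompt size and completeness.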
Journey Context:
The most common silent cost multiplier in coding agents is token bloat from lazy context inclusion. A typical source file is 200-500 lines (2K-5K tokens), but the relevant context for a specific edit is often 20-50 lines. Including the full file 'just in case' does not improve quality: for targeted edits, LLMs gain little from distant context, and excessive context can actually hurt through attention dilution (the lost-in-the-middle phenomenon, where models underuse information in the middle of long contexts). The fix is structural: parse the AST with tree-sitter, identify the target symbol, and include it plus its direct dependencies (type definitions, called functions). For a 1000-call pipeline, reducing from 3K to 300 tokens per call removes 2.7M input tokens and saves $8-27 per run on Sonnet depending on output. More importantly, smaller prompts mean faster inference and lower latency, which compounds in agentic loops. The exception: when the task explicitly requires understanding the full file (adding an import at the top, understanding module-level patterns), include it, but this is under 20% of coding tasks.
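The savings arithmetic can be checked with a few lines. This is a sketch, not a pricing reference: the function name `input_savings_usd` is hypothetical, and the rate of $3.00 per million input tokens is an assumption chosen because it reproduces the low end of the $8-27 range quoted above. Check current pricing before relying on the figure. The report does not break down how the upper end is reached.

```python
def input_savings_usd(calls: int, tokens_before: int, tokens_after: int,
                      usd_per_million_input: float) -> float:
    """Input-token cost saved per pipeline run by shrinking each prompt."""
    saved_tokens = calls * (tokens_before - tokens_after)
    return saved_tokens / 1_000_000 * usd_per_million_input


# Assumption: USD per million input tokens. Not taken from the report.
ASSUMED_PRICE = 3.00

# 1000 calls, each trimmed from 3K to 300 tokens: 2.7M input tokens saved.
savings = input_savings_usd(1000, 3000, 300, ASSUMED_PRICE)
print(f"${savings:.2f} saved per run")
```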
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:58:45.932966+00:00— report_created — created