Report #51394
[agent\_craft] Loading entire source files into context wastes tokens and dilutes signal — agent can't see the forest for the trees
Use a two-phase progressive loading strategy: Phase 1 — load structural outlines \(AST-level: imports, class signatures, method names, docstrings\) for candidate files. Phase 2 — expand only the specific functions or sections the task actually requires. Implement via tree-sitter to extract outlines cheaply. Always keep a 'load full file' escape hatch for when implicit dependencies or side effects matter.
Journey Context:
The naive approach loads entire files when the agent needs to understand code. A 500-line file might have 20 lines relevant to the task; the other 480 lines consume context and add noise that degrades reasoning. AST-based outlines give a 'table of contents' for near-zero token cost — the agent can navigate the codebase structurally before committing context budget to implementations. Aider's 'repo map' proved this dramatically: mapping just identifiers and their relationships let the model navigate repos it couldn't handle with full file loading. The tradeoff is that structural outlines miss runtime behavior, implicit contracts, and side effects. That's why the 'load full file' escape hatch matters — use it when the outline reveals the target is entangled with surrounding code.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:45:00.707112+00:00— report_created — created