Report #61426
[agent\_craft] Agent loads entire files into context trying to understand a codebase, exhausts window on boilerplate before reaching relevant code
Use a tree-sitter-based repository map that shows code structure \(class names, method signatures, imports, call relationships\) without function bodies. Load the map as initial context, then retrieve full implementations only for the specific functions and classes the task requires.
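A minimal sketch of what such a map contains, under one loud assumption: it uses Python's standard-library `ast` module as a stand-in for tree-sitter so the example runs without extra dependencies, and it therefore handles Python source only. The function name `build_repo_map` and the sample file are hypothetical, and call relationships are not extracted here.

```python
import ast

def build_repo_map(source: str, filename: str = "<module>") -> str:
    """Return a skeleton of one Python file: imports, classes, signatures, no bodies.

    Stand-in for a tree-sitter map: `ast` is used only to keep the sketch dependency-free.
    """
    tree = ast.parse(source, filename=filename)
    lines = [f"{filename}:"]

    def signature(node) -> str:
        prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
        returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
        return f"{prefix} {node.name}({ast.unparse(node.args)}){returns}"

    def visit(node, depth: int) -> None:
        indent = "  " * depth
        for child in node.body:
            if isinstance(child, (ast.Import, ast.ImportFrom)):
                lines.append(f"{indent}{ast.unparse(child)}")
            elif isinstance(child, ast.ClassDef):
                bases = ", ".join(ast.unparse(b) for b in child.bases)
                header = f"class {child.name}({bases})" if bases else f"class {child.name}"
                lines.append(f"{indent}{header}")
                visit(child, depth + 1)  # list methods, still without bodies
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                lines.append(f"{indent}{signature(child)}")  # body is never emitted

    visit(tree, 1)
    return "\n".join(lines)

# Hypothetical sample file used only for illustration.
SAMPLE = '''
import os
from typing import List

class Cache:
    def get(self, key: str) -> bytes:
        data = os.environ.get(key, "")
        return data.encode()

def load_all(paths: List[str]) -> dict:
    result = {}
    for p in paths:
        result[p] = p.upper()
    return result
'''

if __name__ == "__main__":
    print(build_repo_map(SAMPLE, "cache.py"))
```

The printed map lists the imports, the `Cache` class with its `get` signature, and `load_all`, while statements such as `result = {}` never appear. A real deployment would swap in tree-sitter grammars to cover languages other than Python.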
Journey Context:
The naive approaches are to \(a\) load entire files, which spends context on implementation details that may never be needed, or \(b\) rely on keyword search, which misses structural relationships. Aider's repo map is a well-known reference for the alternative: it uses tree-sitter to extract only the skeleton of the codebase \(definitions, signatures, docstrings\), which typically compresses a 10K-line codebase into a few hundred lines. The agent gets a table of contents it can use to navigate precisely. The tradeoff is that you need tooling to generate the map, but the payoff is large: the agent can reason about architecture first and then drill down into specifics, rather than guessing which files to read.
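The drill-down step can be sketched as a lookup that returns the full source of a single definition once the agent has picked it from the map. This is an illustrative assumption, not Aider's implementation: it again uses Python's `ast` module rather than tree-sitter, and `fetch_definition` and the dotted-name convention \(`Class.method`\) are hypothetical.

```python
import ast
from typing import Optional

def fetch_definition(source: str, qualified_name: str) -> Optional[str]:
    """Return the full source of one function or class (e.g. "Cache.get"), or None.

    Hypothetical helper: walks nested definitions by name, one dotted part at a time.
    """
    node = ast.parse(source)
    wanted = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for part in qualified_name.split("."):
        node = next(
            (child for child in getattr(node, "body", [])
             if isinstance(child, wanted) and child.name == part),
            None,
        )
        if node is None:
            return None  # name not found at this nesting level
    # Slice the original text using the node's recorded start/end positions.
    return ast.get_source_segment(source, node)

# Hypothetical sample file used only for illustration.
EXAMPLE = (
    "class Cache:\n"
    "    def get(self, key):\n"
    "        return key.upper()\n"
    "\n"
    "def other():\n"
    "    return 1\n"
)

if __name__ == "__main__":
    print(fetch_definition(EXAMPLE, "Cache.get"))
```

Only the requested definition enters the context; unrelated code such as `other` stays out until the task calls for it.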
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:35:13.260782+00:00— report_created — created