Report #79049
[agent\_craft] Agent either loads the entire codebase into context \(wasting tokens, diluting attention\) or retrieves snippets without understanding the codebase's structure \(missing relationships and navigation\)
Build and maintain a condensed 'repo map' — a tree-sitter-derived listing of files, classes, functions, and their signatures without implementations — as persistent context. Load full implementations on demand when working on specific code.
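A minimal sketch of such a map, under loud assumptions: it handles Python files only and swaps in the standard-library `ast` module for tree-sitter so that it runs without extra dependencies. The helper names (`signature`, `outline`, `build_repo_map`) are hypothetical and are not Aider's API; a real implementation would use tree-sitter grammars to cover multiple languages.

```python
import ast
from pathlib import Path


def signature(node):
    """Render 'def name(args) -> ret' for a function node, without its body."""
    prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
    return f"{prefix} {node.name}({ast.unparse(node.args)}){returns}"


def outline(source):
    """Return indented class and function signature lines for one module.

    Only module-level definitions and class members are listed; functions
    nested inside other functions are skipped to keep the map small.
    """
    lines = []

    def visit(body, depth):
        for node in body:
            if isinstance(node, ast.ClassDef):
                lines.append("    " * depth + f"class {node.name}")
                visit(node.body, depth + 1)
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                lines.append("    " * depth + signature(node))

    visit(ast.parse(source).body, 1)
    return lines


def build_repo_map(root):
    """Walk root and emit one 'path:' header per file, followed by its outline."""
    parts = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            entries = outline(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # unparsable files are left out of the map
        parts.append(path.relative_to(root).as_posix() + ":")
        parts.extend(entries)
    return "\n".join(parts)
```

Calling `build_repo_map(".")` returns a single string that can be placed in the agent's persistent context; function bodies never appear in it, only names and signatures.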
Journey Context:
The fundamental tension in context engineering is that you need codebase structure to know what to retrieve, but you cannot load everything. Aider's repo map addresses this by using tree-sitter to extract just the identifiers and signatures from every file, producing a compressed skeleton that fits in a few thousand tokens. This gives the agent a table of contents: it can see that handle\_oauth\_callback exists in auth/handlers.py and navigate there without holding the full implementation in context. The repo map must be regenerated when files change; Aider does this incrementally. The tradeoff is that the map itself consumes context budget, typically 2-5K tokens for a medium project. The alternatives are usually worse, though: blind retrieval misses cross-file relationships, and full loading wastes context on code you do not need. For multi-file coding agents, the repo map is arguably the highest-ROI context engineering technique, because it converts an intractable search problem into a targeted navigation problem.
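The incremental regeneration and on-demand loading described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Aider's implementation: `RepoMapCache` and `load_implementation` are hypothetical names, change detection relies on file modification times, only Python files are handled, and the standard-library `ast` module stands in for tree-sitter.

```python
import ast
from pathlib import Path


class RepoMapCache:
    """Keep per-file outlines and re-extract only files whose mtime changed."""

    def __init__(self, root, extract):
        self.root = Path(root)
        self.extract = extract   # callable: source text -> list of outline lines
        self._entries = {}       # relative path -> (mtime_ns, outline lines)
        self.reparsed = []       # files re-extracted by the last refresh()

    def refresh(self):
        """Update stale entries, drop deleted files, and return the rendered map."""
        self.reparsed = []
        seen = set()
        for path in sorted(self.root.rglob("*.py")):
            rel = path.relative_to(self.root).as_posix()
            seen.add(rel)
            mtime = path.stat().st_mtime_ns
            cached = self._entries.get(rel)
            if cached is not None and cached[0] == mtime:
                continue  # unchanged since last refresh: reuse cached outline
            text = path.read_text(encoding="utf-8")
            self._entries[rel] = (mtime, self.extract(text))
            self.reparsed.append(rel)
        for rel in set(self._entries) - seen:
            del self._entries[rel]  # file was removed from the repo
        return self.render()

    def render(self):
        """Join cached outlines into one map string, ordered by path."""
        lines = []
        for rel in sorted(self._entries):
            lines.append(rel + ":")
            lines.extend("    " + entry for entry in self._entries[rel][1])
        return "\n".join(lines)


def load_implementation(path, name):
    """Return the full source of the first class or function called `name`.

    This is the on-demand step: the map only tells the agent where to look,
    and the body is pulled into context when it is actually needed.
    """
    source = Path(path).read_text(encoding="utf-8")
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, kinds) and node.name == name:
            return ast.get_source_segment(source, node)
    return None
```

In use, the agent would call `refresh()` before each turn to keep the map current, then call `load_implementation("auth/handlers.py", "handle_oauth_callback")` only when it decides to work on that function. Modification times are a cheap change signal; content hashes would be a more robust alternative at the cost of reading every file.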
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:16:44.068459+00:00— report_created — created