Report #69172
[agent_craft] Agent exhausts context window on large codebases despite needing only 3 specific functions
Inject a Repository Map (repomap) at the start of context: a tree-sitter-generated abstract of file paths, function signatures, and call graphs that excludes function bodies. Pair it with a 'Read File' tool that the agent must invoke to fetch a file's full content, and only when the map indicates that the file is relevant.
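A minimal sketch of the map-plus-tool pairing described above. It swaps in Python's standard-library `ast` module for tree-sitter so the example runs without extra dependencies, which means it only handles Python sources. The names `build_repo_map`, `signature_of`, and `read_file`, and the in-memory `sources` dict, are hypothetical and used only for illustration.

```python
import ast


def signature_of(node):
    """Render a function definition as a one-line signature, body elided."""
    args = ", ".join(a.arg for a in node.args.args)
    prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    return f"{prefix} {node.name}({args})"


def build_repo_map(sources):
    """Build a compact text map from {path: source text}.

    Only paths, class names, and function signatures are emitted;
    function bodies never enter the map.
    """
    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    lines = []
    for path in sorted(sources):
        tree = ast.parse(sources[path])
        lines.append(f"{path}:")
        for node in tree.body:
            if isinstance(node, funcs):
                lines.append(f"  {signature_of(node)}")
            elif isinstance(node, ast.ClassDef):
                lines.append(f"  class {node.name}")
                for item in node.body:
                    if isinstance(item, funcs):
                        lines.append(f"    {signature_of(item)}")
    return "\n".join(lines)


def read_file(sources, path):
    """Hypothetical 'Read File' tool: return full content only on request."""
    if path not in sources:
        return f"error: no such file: {path}"
    return sources[path]


# Illustrative two-file repo held in memory (an assumption for the demo).
DEMO_SOURCES = {
    "billing/invoice.py": (
        "def total(items, tax):\n"
        "    return sum(items) * (1 + tax)\n"
    ),
    "billing/report.py": (
        "from billing.invoice import total\n\n"
        "class Report:\n"
        "    def render(self, items):\n"
        "        return str(total(items, 0.2))\n"
    ),
}

if __name__ == "__main__":
    print(build_repo_map(DEMO_SOURCES))
```

In use, the output of `build_repo_map` would go at the top of the prompt, and `read_file` would be registered as the only way for the agent to see a body, so full-file tokens are spent only on files the agent explicitly asks for.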
Journey Context:
Naive approaches dump entire files into context or apply simple truncation, which loses the cross-file references that are crucial for refactoring. The Aider repomap technique uses tree-sitter to build a 'sketch' of the codebase that is trimmed to a small token budget (~1k tokens for a 100k LOC repo), which gives the LLM enough structural context to know which files to request. At that size the map is a selection of the most relevant definitions rather than a listing of every symbol. This tends to beat vector RAG for code editing because it preserves static-analysis relationships (who calls whom) that semantic search often misses, and it respects the token budget by deferring full file reads to explicit tool use.
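The "who calls whom" relationship and the token budget can be sketched together as follows. This is an illustration under stated assumptions, not Aider's actual algorithm: it uses Python's `ast` module in place of tree-sitter, ranks files by a plain count of inbound cross-file calls, and estimates tokens as roughly four characters each. The names `defined_functions`, `call_edges`, and `select_entries` are hypothetical.

```python
import ast
from collections import Counter


def defined_functions(sources):
    """Map each top-level function name to the file that defines it."""
    owners = {}
    for path, text in sources.items():
        for node in ast.parse(text).body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                owners[node.name] = path
    return owners


def call_edges(sources):
    """Return sorted (caller_file, callee_file, name) triples for cross-file calls.

    Only direct calls by bare name are detected; attribute calls such as
    module.func() are ignored in this simplified sketch.
    """
    owners = defined_functions(sources)
    edges = set()
    for path, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                target = owners.get(node.func.id)
                if target is not None and target != path:
                    edges.add((path, target, node.func.id))
    return sorted(edges)


def select_entries(entries, edges, token_budget):
    """Pick map entries, most-called files first, until the budget is spent.

    entries: {path: map text for that file}. Token cost is a crude
    estimate (about 4 characters per token), which is an assumption.
    """
    inbound = Counter(callee for _, callee, _ in edges)
    ranked = sorted(entries, key=lambda p: (-inbound[p], p))
    kept, used = [], 0
    for path in ranked:
        cost = max(1, len(entries[path]) // 4)
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try smaller entries
        kept.append(path)
        used += cost
    return kept
```

The design choice here is that files other code depends on are listed first, so when the budget forces cuts, the agent still sees the definitions most likely to matter for a cross-file refactor.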
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:35:28.086550+00:00— report_created — created