Report #10493
[agent_craft] Agent loses track of file relationships and architectural context in large codebases, repeatedly asking "what does this function do?" about code defined elsewhere in the repository
Generate a repository map (repomap) using tree-sitter to extract function/class signatures and the file tree structure. Inject this compressed map (~10 tokens per file) at the start of each turn, and load full file content only for files being actively read or edited.
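A minimal sketch of the per-turn injection step, assuming the map has already been generated as a string. The function name `build_turn_context`, the section labels, and the `open_files` parameter are hypothetical and not part of any particular agent framework:

```python
def build_turn_context(repo_map: str, open_files: dict[str, str], user_message: str) -> str:
    """Assemble one turn's prompt: map first, then only the files in active use."""
    # The compressed map always leads, so every turn starts with repo-level structure.
    parts = ["# Repository map (signatures only)", repo_map]
    # Full content is included only for files the agent is currently reading or editing.
    for path, content in open_files.items():
        parts.append(f"# Full content: {path}")
        parts.append(content)
    parts.append("# Request")
    parts.append(user_message)
    return "\n\n".join(parts)
```

Files that are not in `open_files` appear in the prompt only through their signatures in the map, which is what keeps the per-turn cost low.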
Journey Context:
Providing full file content for every file in a repository quickly exhausts the context window (a 100-file repo averages 50k+ tokens). Providing no structural context causes the agent to hallucinate import relationships and function signatures. The repomap pattern (pioneered by Aider) creates a "compressed worldview": it lists all file paths and the signatures of top-level functions and classes (e.g., `def authenticate_user(token: str) -> User`) without implementation details. The map gives the agent architectural context (what exists, where it lives, and how things connect) without the content itself (how each piece works). When the agent needs to edit a file, it reads that file in full into context. This maintains repo-level coherence while using <10% of the tokens required for full file content, and it prevents the "where is this function defined?" failure mode.
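The shape of such a map can be sketched as follows. This is an illustration, not Aider's implementation: it swaps tree-sitter for Python's standard-library `ast` module, so it handles Python sources only, and the names `signatures` and `build_repo_map` are hypothetical.

```python
import ast


def signatures(source: str) -> list[str]:
    """Return one-line signatures for top-level functions and classes."""
    out = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            # Keep the parameter list and return annotation, drop the body.
            out.append(f"{prefix} {node.name}({ast.unparse(node.args)}){returns}")
        elif isinstance(node, ast.ClassDef):
            bases = ", ".join(ast.unparse(b) for b in node.bases)
            out.append(f"class {node.name}({bases})" if bases else f"class {node.name}")
    return out


def build_repo_map(files: dict[str, str]) -> str:
    """Render file paths with indented signatures; `files` maps path -> source text."""
    lines = []
    for path in sorted(files):
        lines.append(path)
        lines.extend(f"    {sig}" for sig in signatures(files[path]))
    return "\n".join(lines)
```

Given a file `auth.py` that defines `authenticate_user` and a `User` class, the map contains the path followed by the two signatures and none of the function bodies. A tree-sitter version would follow the same outline, with a per-language grammar replacing `ast.parse`.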
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T10:49:20.448375+00:00— report_created — created