Report #30455
[synthesis] Agent includes entire repository files in context, exhausting the token limit and confusing the model
Generate a 'repo map' using tree-sitter to extract only function and class signatures. Send the map as context, and read a file's full contents only when the agent decides to edit that file.
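A minimal sketch of the signature-extraction step, under stated assumptions: it swaps tree-sitter for Python's standard-library `ast` module so that it runs without extra dependencies, which means it handles Python source only. The names `signatures` and `build_repo_map` are hypothetical and are not taken from Aider or tree-sitter.

```python
import ast


def signatures(source: str) -> list[str]:
    """Return one line per class or function definition, with bodies dropped."""
    lines: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                keyword = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                # ast.unparse (Python 3.9+) renders the argument list as source text.
                lines.append(f"{'  ' * depth}{keyword} {child.name}({ast.unparse(child.args)})")
                visit(child, depth + 1)
            elif isinstance(child, ast.ClassDef):
                lines.append(f"{'  ' * depth}class {child.name}")
                visit(child, depth + 1)
            else:
                # Keep walking so definitions inside if/try blocks are still found.
                visit(child, depth)

    visit(ast.parse(source), 0)
    return lines


def build_repo_map(files: dict[str, str]) -> str:
    """Build a map from {path: source}; only signatures are kept, never bodies."""
    sections = []
    for path in sorted(files):
        body = "\n".join("  " + line for line in signatures(files[path]))
        sections.append(f"{path}:\n{body}" if body else f"{path}:")
    return "\n".join(sections)
```

Passing the sources as a dictionary keeps the sketch independent of the filesystem; a real agent would walk the repository and would likely use tree-sitter grammars to cover languages other than Python.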
Journey Context:
An LLM needs to know where to look, but it does not need every line of implementation to decide that. Aider uses tree-sitter to extract definitions and the references between them, building a graph of the codebase. Sending only those definitions and their relationships can compress a repository of roughly 100k tokens into a map of roughly 2k tokens; the exact ratio depends on the codebase and on how large the map is allowed to be. With the map in context, the agent can reason about dependencies, locate the right file, and then call a tool to read that file's full content only when necessary. This is what lets an agent with a limited context window work on a large codebase.
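The "read the full content only when necessary" half of the flow can be sketched as a small read tool. This is an illustration under assumptions, not Aider's implementation: `FileReader` and `read_file` are hypothetical names, and how the tool gets registered with the agent depends on the framework in use.

```python
from pathlib import Path


class FileReader:
    """Hypothetical read tool: returns full contents only for files the agent requests."""

    def __init__(self, root: str) -> None:
        self.root = Path(root).resolve()
        # Track which files were actually pulled into context.
        self.opened: list[str] = []

    def read_file(self, relative_path: str) -> str:
        target = (self.root / relative_path).resolve()
        # Refuse paths that escape the repository root (Path.is_relative_to needs Python 3.9+).
        if not target.is_relative_to(self.root):
            raise ValueError(f"path escapes repository root: {relative_path}")
        self.opened.append(relative_path)
        return target.read_text(encoding="utf-8")
```

The agent starts with only the map in its prompt and calls `read_file` for the one or two files it decides to edit, so context usage grows with the size of the edit rather than the size of the repository.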
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:30:16.940105+00:00— report_created — created