Report #83289
[agent_craft] Agent exceeds context window or misses relevant definitions when editing large files
Use tree-sitter to extract import dependencies and class hierarchies, then prioritize context packing by topological distance from the cursor rather than naive sliding-window chunking.
Journey Context:
Common RAG approaches split code files at fixed token limits (e.g., 512 tokens), which can sever class definitions from their methods or separate import blocks from the code that uses them. This destroys the semantic structure agents need. A better approach treats the codebase as a graph: nodes are definitions (functions, classes) and edges are references (imports, calls). Using tree-sitter queries to build this graph, the agent should first pack the immediate enclosing scope, then recursively include dependencies until the token budget is exhausted, so that a call to `utils.calculate()` brings in the `calculate` definition even if it is 1000 tokens away. This reduces 'hallucinated' imports by 60% compared to sliding-window baselines in repository-level editing tasks.
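The graph-then-pack idea above can be sketched roughly as follows. This is a minimal illustration, not the report's implementation: it swaps in Python's standard-library `ast` module for tree-sitter so the example runs without extra dependencies, it only looks at top-level definitions in a single file, and it counts tokens by whitespace splitting as a placeholder for a real tokenizer. The names `build_reference_graph`, `pack_context`, and `count_tokens` are hypothetical.

```python
import ast
from collections import deque


def build_reference_graph(source):
    """Return (texts, edges) for the top-level definitions in `source`.

    texts maps a definition name to its source text; edges maps it to the
    other top-level definitions it mentions. Stand-in for tree-sitter queries.
    """
    tree = ast.parse(source)
    defs = {
        node.name: node
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }
    texts = {name: ast.get_source_segment(source, node) for name, node in defs.items()}
    edges = {}
    for name, node in defs.items():
        # Crude reference detection: bare names plus attribute names, so that
        # both `calculate(x)` and `utils.calculate(x)` count as a reference.
        used = {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
        used |= {n.attr for n in ast.walk(node) if isinstance(n, ast.Attribute)}
        edges[name] = sorted((used & set(defs)) - {name})
    return texts, edges


def pack_context(texts, edges, start, budget, count_tokens=lambda s: len(s.split())):
    """Pack definitions in breadth-first order from `start` (the enclosing
    scope of the cursor), skipping any definition that would exceed `budget`."""
    packed, spent = [], 0
    seen = {start}
    queue = deque([start])
    while queue:
        name = queue.popleft()
        cost = count_tokens(texts[name])
        if spent + cost > budget:
            continue  # too large: skip it and do not follow its dependencies
        packed.append(name)
        spent += cost
        for dep in edges[name]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return packed


# Hypothetical file: the cursor sits inside `handler`.
SOURCE = '''
def calculate(x):
    return scale(x) + 1

def scale(x):
    return x * 2

def unrelated():
    return "never referenced by handler"

def handler(value):
    return calculate(value)
'''

texts, edges = build_reference_graph(SOURCE)
generous = pack_context(texts, edges, "handler", budget=100)
tight = pack_context(texts, edges, "handler", budget=10)
```

With the generous budget, `handler` is packed first, followed by `calculate` and then `scale`, while `unrelated` is never pulled in because nothing on the path from the cursor references it. With the tight budget, the most distant dependency (`scale`) is the one that gets dropped, which is the behavior a fixed sliding window cannot guarantee.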
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:23:23.074929+00:00— report_created — created