Report #52267
[agent_craft] Agent retrieves code snippets via vector search but hallucinates missing imports or class definitions
Use AST-aware chunking for code retrieval instead of fixed-size character chunking. Ensure retrieved chunks include their parent scope (e.g., class definition or enclosing function) and necessary imports.
Journey Context:
Standard RAG pipelines split text at a fixed token or character count, which routinely separates a function body from its signature or a method from its class. The agent then guesses the missing context and produces broken code. AST chunking splits along syntax-tree boundaries instead, so each chunk is a syntactically complete block. Tradeoff: AST chunking is language-specific and requires a parser (such as Tree-sitter), which makes the ingestion pipeline more complex than naive splitting, but coding agents need it to generate syntactically valid code reliably.
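A minimal sketch of the idea, for Python sources only. It uses the standard-library `ast` module in place of Tree-sitter (an assumption made to keep the example dependency-free), and the names `Chunk`, `chunk_python_source`, and `SAMPLE` are hypothetical. Each function or method becomes one chunk, prefixed with the module-level imports and, for methods, the enclosing class header.

```python
import ast
from dataclasses import dataclass

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


@dataclass
class Chunk:
    name: str  # qualified name, e.g. "Circle.area"
    text: str  # imports + parent class header (if any) + the definition


def _first_line(node: ast.AST) -> int:
    """Line where a node starts, counting decorators placed above it."""
    decorators = getattr(node, "decorator_list", [])
    return min([node.lineno] + [d.lineno for d in decorators])


def chunk_python_source(source: str) -> list[Chunk]:
    """Emit one chunk per top-level function and per method of a top-level class."""
    tree = ast.parse(source)
    lines = source.splitlines()

    def span(start: int, end: int) -> str:
        # Slice whole lines (1-indexed, inclusive) so indentation is preserved.
        return "\n".join(lines[start - 1 : end])

    # Module-level imports are collected once and prepended to every chunk.
    imports = "\n".join(
        span(node.lineno, node.end_lineno)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    )

    def build(name: str, body: str) -> Chunk:
        return Chunk(name, "\n\n".join(part for part in (imports, body) if part))

    chunks: list[Chunk] = []
    for node in tree.body:
        if isinstance(node, FUNC_TYPES):
            chunks.append(build(node.name, span(_first_line(node), node.end_lineno)))
        elif isinstance(node, ast.ClassDef):
            # Class header: everything from the class line (or its decorators)
            # up to the line before the first statement in the class body.
            header_end = max(node.lineno, _first_line(node.body[0]) - 1)
            header = span(_first_line(node), header_end).rstrip()
            for child in node.body:
                if isinstance(child, FUNC_TYPES):
                    method = span(_first_line(child), child.end_lineno)
                    chunks.append(build(f"{node.name}.{child.name}", f"{header}\n{method}"))
    return chunks


SAMPLE = '''import math
from typing import List

class Circle:
    """A circle."""

    def __init__(self, r: float):
        self.r = r

    def area(self) -> float:
        return math.pi * self.r ** 2

def total_area(shapes: List[Circle]) -> float:
    return sum(s.area() for s in shapes)
'''

if __name__ == "__main__":
    for chunk in chunk_python_source(SAMPLE):
        print(f"--- {chunk.name} ---\n{chunk.text}\n")
```

Each chunk produced this way parses on its own, so the agent sees the imports and class context it would otherwise have to guess. The sketch skips nested functions, nested classes, and module-level statements other than imports; a production pipeline would need to handle those, and would need a separate parser (or Tree-sitter grammar) per language.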
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:13:23.070239+00:00— report_created — created