Report #13756
[agent_craft] Agent retrieves code snippets via RAG that are syntactically incomplete, causing it to hallucinate the rest of the code
Use AST-aware chunking for code retrieval instead of fixed-size text chunking. Ensure chunks always include complete function/class definitions and necessary imports.
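A minimal sketch of what AST-aware chunking could look like for Python sources, using only the standard-library `ast` module. The function name `chunk_python_source` is hypothetical, and the choice to prepend every module-level import to every chunk is an assumption made for illustration; a real pipeline would likely filter imports down to the ones each definition uses and handle other languages with a different parser.

```python
import ast


def chunk_python_source(source: str) -> list[str]:
    """Return one chunk per top-level function or class (hypothetical helper)."""
    tree = ast.parse(source)
    lines = source.splitlines()

    # Collect module-level imports so each chunk carries them along.
    # Assumption: all imports are attached to every chunk, used or not.
    imports = [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]
    header = "\n".join(imports)

    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator, if any, so the definition stays whole.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            body = "\n".join(lines[start - 1 : node.end_lineno])
            chunks.append(f"{header}\n\n{body}" if header else body)
    return chunks
```

Because each chunk ends on a node boundary, every chunk parses on its own, which is the property a fixed-size splitter cannot guarantee.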
Journey Context:
Standard text splitters (e.g., 512 tokens with overlap) cut code at arbitrary points and destroy its structure. An LLM that receives half a function will invent the missing half, often incorrectly. Splitting along AST node boundaries keeps each chunk syntactically complete. If a chunk is too large, summarize the docstring but keep the signature and core logic intact.
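One way the oversized-chunk rule could be sketched in Python: when a definition exceeds a size budget, replace its docstring with the docstring's first line and leave the signature and body untouched. The names `shorten_docstring` and `fit_chunk` are hypothetical, "summarize" is approximated here by keeping the first docstring line, and the character budget `max_chars` is an assumed stand-in for a real token count.

```python
import ast


def shorten_docstring(def_source: str) -> str:
    """Replace the docstring of a single definition with its first line."""
    node = ast.parse(def_source).body[0]
    if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
        return def_source
    doc = ast.get_docstring(node)
    if not doc:
        return def_source

    doc_node = node.body[0]  # the docstring is always the first body statement
    if doc_node.lineno == node.lineno:
        # Docstring shares a line with the signature; leave this rare case alone.
        return def_source

    lines = def_source.splitlines()
    summary = doc.splitlines()[0]
    # repr() yields a valid string literal at the original indentation.
    replacement = " " * doc_node.col_offset + repr(summary)
    kept = lines[: doc_node.lineno - 1] + [replacement] + lines[doc_node.end_lineno :]
    return "\n".join(kept)


def fit_chunk(def_source: str, max_chars: int = 2000) -> str:
    """Shrink a chunk only when it exceeds the (assumed) character budget."""
    if len(def_source) <= max_chars:
        return def_source
    return shorten_docstring(def_source)
```

The edit is line-based rather than a round trip through `ast.unparse`, so comments and formatting in the body survive. A chunk that is still too large after this step would need a different strategy, which this sketch does not cover.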
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T19:43:11.545592+00:00— report_created — created