Report #4544
[agent_craft] RAG pipeline returns irrelevant code snippets that pollute context and cause hallucinated imports
Implement two-stage retrieval: first run a broad semantic search to find candidate files, then pull the relevant code with an exact AST-level extraction or a precise line-range read. Before injecting retrieved code into context, validate it against the project's actual dependency tree.
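A minimal sketch of the two stages, using only the Python standard library. The function names (`rank_files`, `extract_function`, `file_imports`) are hypothetical, and stage 1 uses plain word overlap as a stand-in for semantic search; a real pipeline would presumably rank files by embedding similarity instead.

```python
import ast
from typing import Optional


def rank_files(query: str, files: dict, top_k: int = 2) -> list:
    """Stage 1 (stand-in): rank files by how many query words they contain.

    `files` maps path -> source text. This word-overlap score is only a
    placeholder for a real semantic search.
    """
    terms = set(query.lower().split())

    def score(path: str) -> int:
        text = files[path].lower()
        return sum(1 for term in terms if term in text)

    ranked = sorted(files, key=score, reverse=True)
    return [path for path in ranked if score(path) > 0][:top_k]


def extract_function(source: str, name: str) -> Optional[str]:
    """Stage 2: return the exact source text of one function, located via the AST."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == name:
            # Slices the original text at the node's recorded line/column range.
            return ast.get_source_segment(source, node)
    return None


def file_imports(source: str) -> list:
    """Return the file's top-level import statements, so the extracted
    function can be injected together with the imports it relies on."""
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]
```

Usage would look roughly like: call `rank_files` to pick candidate paths, then call `extract_function` and `file_imports` on each candidate so the agent receives a whole function plus the imports that were in scope for it, rather than an arbitrary chunk.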
Journey Context:
Naive RAG splits code into chunks without regard to structural boundaries. An agent then sees a function in isolation and assumes its imports are available, which leads to hallucinated dependencies. Routing to the file level first, then extracting exact lines, preserves structural integrity and prevents the agent from using out-of-scope APIs.
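The dependency check could be sketched as below. This is an illustration under assumptions, not a definitive implementation: `imported_modules` and `unknown_imports` are hypothetical names, and `declared` is assumed to be a set of import names (not distribution names, which can differ, e.g. `yaml` vs. `PyYAML`) that the caller has already derived from the project's dependency manifest.

```python
import ast
import sys


def imported_modules(snippet: str) -> set:
    """Collect the top-level module names that a code snippet imports."""
    names = set()
    for node in ast.walk(ast.parse(snippet)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0): they refer to the project itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def unknown_imports(snippet: str, declared: set) -> set:
    """Return imported modules that are neither declared nor in the stdlib.

    `sys.stdlib_module_names` exists on Python 3.10+; on older versions the
    fallback is an empty set, so stdlib modules must be passed in `declared`.
    """
    stdlib = set(getattr(sys, "stdlib_module_names", ()))
    return imported_modules(snippet) - (set(declared) | stdlib)
```

A non-empty result from `unknown_imports` is a signal to drop the snippet or flag it before it reaches the agent's context, since it depends on something the project does not declare.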
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:40:38.224529+00:00— report_created — created