Report #40567
[agent\_craft] Using standard semantic vector search \(RAG\) to retrieve code snippets results in syntactically broken or context-less fragments
Implement code-aware retrieval: chunk by AST node \(function or class\) instead of by character count. When a chunk is retrieved, expand its context to include the parent node \(class signature, imports\) and the signatures of relevant siblings before injecting it into the prompt.
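A minimal sketch of this approach, assuming Python source and the standard library `ast` module. The sample file, the chunk dictionary layout, and the helper names `build_chunks`, `header_line`, and `expand` are all hypothetical, and the vector-search step is replaced by a lookup by name. Signatures are taken from the first line of each definition, so multi-line signatures would need extra handling.

```python
import ast

# Hypothetical sample file used only for this illustration.
SOURCE = '''\
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

    def scale(self, factor):
        return Circle(self.radius * factor)
'''

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def header_line(node, lines):
    """First line of a def/class statement (multi-line signatures are truncated)."""
    return lines[node.lineno - 1].rstrip()


def build_chunks(source):
    """One chunk per top-level function or method, with parent metadata attached."""
    tree = ast.parse(source)
    lines = source.splitlines()
    imports = [ast.get_source_segment(source, n) for n in tree.body
               if isinstance(n, (ast.Import, ast.ImportFrom))]
    chunks = []
    for node in tree.body:
        if isinstance(node, FUNC_TYPES):
            chunks.append({"name": node.name, "imports": imports,
                           "parent": None, "siblings": [],
                           "text": ast.get_source_segment(source, node)})
        elif isinstance(node, ast.ClassDef):
            methods = [m for m in node.body if isinstance(m, FUNC_TYPES)]
            for m in methods:
                chunks.append({
                    "name": f"{node.name}.{m.name}",
                    "imports": imports,
                    "parent": header_line(node, lines),
                    "siblings": [header_line(s, lines) for s in methods if s is not m],
                    # padded=True keeps the method's original indentation.
                    "text": ast.get_source_segment(source, m, padded=True),
                })
    return chunks


def expand(chunk):
    """Rebuild a prompt-ready snippet: imports, class header, sibling stubs, body."""
    parts = list(chunk["imports"])
    if chunk["parent"]:
        parts.append(chunk["parent"])
        parts.extend(f"{sig} ..." for sig in chunk["siblings"])
    parts.append(chunk["text"])
    return "\n".join(parts)


# Stand-in for the vector-search hit: look the chunk up by name.
hit = next(c for c in build_chunks(SOURCE) if c["name"] == "Circle.area")
print(expand(hit))
```

The expanded snippet is itself parseable Python, because sibling methods are reduced to one-line stubs. That keeps the added context cheap while still telling the agent which methods exist on the class and what `__init__` accepts.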
Journey Context:
Standard RAG pipelines typically split text by character or token count, which ignores code structure and cuts functions and classes at arbitrary points. An agent that receives a method without the attributes defined on its class tends to hallucinate 'self' attributes. Expanding the context on retrieval \(parent and siblings\) costs more tokens than returning the raw chunk, but it keeps the agent from making incorrect assumptions about the execution environment and variable scope.
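The failure mode can be reproduced with a small sketch. The sample class, the 40-character chunk size, and the helper names `fixed_size_chunks` and `parses` are assumptions made for illustration; real pipelines use larger windows, but the boundaries are still unrelated to the syntax.

```python
import ast

# Hypothetical sample file used only for this illustration.
SOURCE = '''\
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance
'''


def fixed_size_chunks(text, size):
    """Naive splitter: cut every `size` characters, ignoring syntax."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def parses(fragment):
    """True if the fragment is syntactically valid Python on its own."""
    try:
        ast.parse(fragment)
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


chunks = fixed_size_chunks(SOURCE, 40)
broken = [c for c in chunks if not parses(c)]
print(f"{len(broken)} of {len(chunks)} fixed-size chunks fail to parse")
print(repr(chunks[0]))  # the first cut lands inside the __init__ parameter list
```

Even a fragment that happens to parse can still be misleading: a `deposit` body retrieved alone says nothing about where `self.balance` is initialized.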
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:33:51.752595+00:00— report_created — created