Report #21350
[frontier] Naive RAG returns irrelevant code chunks that lose structural context
Replace vector-only retrieval with structural context injection: parse code into ASTs, build a dependency graph, and retrieve sub-graphs or file-level context rather than isolated chunks. Use vector search only as a leaf-finding mechanism, then expand to the enclosing scope.
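The "expand to the enclosing scope" step can be sketched with Python's standard `ast` module. This is a minimal illustration, not a reference implementation: it assumes the vector search has already returned a hit as a line number, and the `SOURCE` sample and the `enclosing_scope` helper are hypothetical names introduced here.

```python
import ast

# Hypothetical file contents; in practice this is the file the vector hit points into.
SOURCE = '''\
import os

class Repo:
    def __init__(self, root):
        self.root = root

    def path(self, name):
        return os.path.join(self.root, name)
'''

def enclosing_scope(source: str, hit_line: int) -> str:
    """Return the source of the top-level class or function containing hit_line.

    Falls back to the whole file when the line sits at module level.
    """
    tree = ast.parse(source)
    for node in tree.body:  # top-level statements only
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.lineno <= hit_line <= node.end_lineno:
                return ast.get_source_segment(source, node)
    return source

# Assume the embedding search matched line 8 (the os.path.join call).
context = enclosing_scope(SOURCE, 8)
```

Here a hit inside `path` yields the whole `Repo` class rather than a single method, so the agent also sees `__init__` and the attributes the method depends on.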
Journey Context:
Chunking code destroys the very structure an agent needs to understand imports, class hierarchies, and call stacks. A retrieved 50-line chunk is often useless without its dependencies. Production agents are shifting to GraphRAG or AST-aware retrieval: find the anchor node via embedding, then traverse the graph to pull in the relevant class, its imports, and its interfaces. This trades the low latency of simple vector search for the high recall of structural understanding, which is critical for code modification.
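The anchor-then-traverse idea can be sketched as a small import graph plus a bounded breadth-first walk. This is a simplified sketch under stated assumptions: the repository is a hypothetical in-memory dict (`MODULES`), edges come only from top-level module names in `import` statements, and the anchor module is assumed to have been chosen by an embedding search that is not shown. `import_graph` and `expand_context` are illustrative names, not part of any library.

```python
import ast
from collections import deque

# Hypothetical in-memory repo: module name -> source text.
MODULES = {
    "app": "import service\n\ndef main():\n    return service.run()\n",
    "service": "from models import User\n\ndef run():\n    return User()\n",
    "models": "class User:\n    pass\n",
    "unrelated": "def noop():\n    pass\n",
}

def import_graph(modules):
    """Map each module to the in-repo modules it imports."""
    graph = {}
    for name, src in modules.items():
        deps = set()
        for node in ast.walk(ast.parse(src)):
            if isinstance(node, ast.Import):
                deps.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module.split(".")[0])
        # Keep only edges that point at modules inside the repo.
        graph[name] = sorted(d for d in deps if d in modules)
    return graph

def expand_context(graph, anchor, max_hops=2):
    """Breadth-first walk from the anchor, following imports up to max_hops."""
    hops = {anchor: 0}
    queue = deque([anchor])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # hop budget spent on this branch
        for dep in graph.get(current, []):
            if dep not in hops:
                hops[dep] = hops[current] + 1
                queue.append(dep)
    return list(hops)

graph = import_graph(MODULES)
# Assume the embedding search picked "app" as the anchor.
context_modules = expand_context(graph, "app")
```

The `max_hops` bound is the knob for the latency and recall tradeoff described above: a larger value pulls in more of the dependency chain at the cost of a bigger context and more traversal work.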
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:14:44.643309+00:00 — report_created — created