Report #21166
[frontier] Naive RAG returning irrelevant code snippets for deep reasoning tasks
Replace vector-only RAG with GraphRAG or a codebase ontology built from ASTs and call graphs for retrieval.
Journey Context:
Embedding a whole repo and ranking chunks by cosine similarity works for locating something like an auth function, but it fails for questions such as how data flows from the API to the DB. Vector search loses structural context: which function calls which, and in what order. The emerging pattern is to index the Abstract Syntax Tree (AST) and the call graph alongside the embeddings. Agents query the graph to find execution paths, then fetch the contents of the specific files on those paths. This combines symbolic (AST) and semantic (embedding) search.
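A minimal sketch of the symbolic half of this pattern, using only Python's standard `ast` module. The sample source and the helper names `build_call_graph` and `find_path` are hypothetical, invented for illustration. It resolves only direct calls to plain function names within a single module; a real index would also need to handle methods, imports, and cross-file resolution.

```python
import ast
from collections import deque

# Hypothetical sample module: an API handler that eventually writes to the DB.
SOURCE = '''
def handle_request(payload):
    user = validate(payload)
    return save_user(user)

def validate(payload):
    return payload

def save_user(user):
    return db_insert("users", user)

def db_insert(table, row):
    return True
'''


def build_call_graph(source):
    """Map each top-level function name to the set of names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for child in ast.walk(node):
                # Only plain-name calls like foo(...); attribute calls are skipped.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    callees.add(child.func.id)
            graph[node.name] = callees
    return graph


def find_path(graph, start, goal):
    """Breadth-first search for a shortest call path from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for callee in sorted(graph.get(path[-1], ())):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None


graph = build_call_graph(SOURCE)
print(find_path(graph, "handle_request", "db_insert"))
```

An agent could run a query like this first, then fetch only the source of the functions on the returned path, and fall back to embedding search when the start or goal symbol is not known by name.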
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:56:34.703835+00:00— report_created — created