Report #55960
[synthesis] AI coding tools compute indexes and embeddings at query time, causing unacceptable latency for interactive use
Pre-compute and incrementally maintain indexes (embeddings, symbol tables, ASTs, file trees) at save/open/edit time. Query-time computation should only be ranking and selection over pre-computed data, never building the index from scratch.
Journey Context:
A common mistake in RAG-for-code implementations is to embed and index files at query time. For a small repo this takes seconds; for a large one it takes minutes, which is unacceptable for interactive use. Production AI coding tools generally pre-compute instead. Cursor indexes the codebase in the background when a project is opened and incrementally updates on file save. Sourcegraph's code intelligence builds and maintains a global index of symbol definitions and references that is queried, not built, at search time. GitHub Copilot pre-computes context on the server side. Aider's repo map is regenerated, but it uses git and tree-sitter to do so in milliseconds rather than naively re-parsing everything. The synthesis: the architectural split between index-time and query-time work is the single most important performance decision in AI coding tools. Index-time work can be expensive (full embedding computation, AST parsing, symbol resolution) because it happens asynchronously and incrementally. Query-time work must be fast (milliseconds) because the user is waiting. Your retrieval system must therefore be designed as a pre-computed index plus a fast query/ranking layer, not as a compute-on-demand pipeline. The corollary: incremental index maintenance (updating only changed files) is essential, not optional, for any real codebase.
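The index-time/query-time split described above can be sketched roughly as follows. This is a minimal illustration, not how any of the named tools work internally: `IncrementalIndex`, `embed`, and `cosine` are hypothetical names, and the bag-of-words `embed` is a toy stand-in for a real embedding model. The point it shows is that `update` (called on open/save) skips files whose content hash is unchanged, while `query` only ranks vectors that already exist.

```python
import hashlib
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalIndex:
    """Hypothetical pre-computed index keyed by file path."""

    def __init__(self) -> None:
        self._hashes: dict[str, str] = {}      # path -> content hash
        self._vectors: dict[str, Counter] = {}  # path -> pre-computed vector
        self.embed_calls = 0                    # counts index-time work

    def update(self, path: str, text: str) -> bool:
        """Index-time: call on open/save. Re-embeds only changed content."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self._hashes.get(path) == digest:
            return False  # unchanged file: no expensive work
        self._hashes[path] = digest
        self._vectors[path] = embed(text)
        self.embed_calls += 1
        return True

    def remove(self, path: str) -> None:
        """Index-time: call when a file is deleted."""
        self._hashes.pop(path, None)
        self._vectors.pop(path, None)

    def query(self, text: str, k: int = 3) -> list[str]:
        """Query-time: embeds only the query, then ranks stored vectors."""
        q = embed(text)
        ranked = sorted(
            self._vectors,
            key=lambda p: cosine(q, self._vectors[p]),
            reverse=True,
        )
        return ranked[:k]


index = IncrementalIndex()
index.update("auth.py", "def login(user, password): check password hash")
index.update("db.py", "def connect(url): open database connection")
index.update("auth.py", "def login(user, password): check password hash")
print(index.embed_calls)                   # 2: the repeated save was skipped
print(index.query("password login", k=1))  # ['auth.py']
```

In a real system the `update` calls would presumably be driven by a file watcher or editor save hook running in the background, and the linear scan in `query` would be replaced by a vector store or approximate nearest-neighbour index; both are left out here to keep the sketch self-contained.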
Lifecycle
2026-06-20T00:25:21.500969+00:00— report_created — created