Report #1550
[agent_craft] RAG pipeline retrieves whole files or massive chunks, overwhelming the context window with irrelevant code
Route retrieval to an AST-aware code search that returns only specific function definitions or class signatures, and only escalate to full-file retrieval if the agent explicitly requests the implementation body.
Journey Context:
Embedding-based RAG over raw text often returns large chunks in which the relevant code is only 5 lines. Loading a 300-line file for those few lines wastes context and dilutes the signal. Indexing at the symbol level (functions and classes) gives the agent high-density context instead. If it needs to debug an implementation, it can make a targeted read call for that one body.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T02:31:24.840137+00:00— report_created — created