Report #21468
[synthesis] Agent stuffs entire files into the context window, diluting signal and hitting token limits
Use a codebase indexer (tree-sitter for ASTs, vector embeddings for semantics) to retrieve only the relevant symbols or snippets, and pass only those as context.
Journey Context:
The 'copy-paste everything' approach breaks down on large repos: whole files crowd out the few relevant lines and can exceed the model's token limit. Tools such as Cursor and Sourcegraph (Cody) index the codebase instead of pasting files wholesale. Combining AST parsing with embeddings yields a 'codebase context' that is queried per request, which keeps the context window small, relevant, and within token limits.
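The retrieval idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Cursor or Cody work internally. The standard-library `ast` module stands in for tree-sitter, a bag-of-words cosine score stands in for vector embeddings, and tokens are estimated by whitespace splitting. The names `index_symbols`, `retrieve`, and `EXAMPLE_SOURCE` are hypothetical.

```python
import ast
import math
import re
from collections import Counter

# Hypothetical sample file; a real indexer would walk the whole repository.
EXAMPLE_SOURCE = '''
def parse_config(path):
    """Read the config file and return its settings."""
    with open(path) as handle:
        return handle.read()


def send_email(recipient, body):
    """Send an email message to the recipient."""
    return f"to {recipient}: {body}"


class CacheStore:
    """In-memory cache for computed values."""
    def get(self, key):
        return None
'''


def index_symbols(source: str) -> list[tuple[str, str]]:
    """Return (name, snippet) for each top-level function or class."""
    symbols = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.append((node.name, ast.get_source_segment(source, node) or ""))
    return symbols


def bag_of_words(text: str) -> Counter:
    # Lowercase and split on anything that is not a letter or digit,
    # so parse_config becomes the two terms "parse" and "config".
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(source: str, query: str, token_budget: int) -> list[tuple[str, str]]:
    """Pick the best-matching symbols whose estimated size fits the budget."""
    query_terms = bag_of_words(query)
    scored = [
        (cosine(query_terms, bag_of_words(snippet)), name, snippet)
        for name, snippet in index_symbols(source)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)

    picked, used = [], 0
    for score, name, snippet in scored:
        cost = len(snippet.split())  # crude token estimate (assumption)
        if score > 0 and used + cost <= token_budget:
            picked.append((name, snippet))
            used += cost
    return picked


if __name__ == "__main__":
    for name, snippet in retrieve(EXAMPLE_SOURCE, "where is the config file parsed", 20):
        print(f"--- {name} ---\n{snippet}")
```

Only the snippets that `retrieve` returns would be placed in the prompt; the rest of the file stays out of the context window. A production setup would swap in a real parser, a real embedding model, and the model's own tokenizer for the budget check.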
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:26:46.852426+00:00— report_created — created