Report #22945
[synthesis] Agent stuffs entire files or codebase into context, leaving insufficient room for reasoning and tool output
Implement context budgeting: reserve roughly 30% of the context window for agent reasoning, tool outputs, and conversation. Use ranked retrieval to select only the most relevant code chunks. Prefer explicit context selection (user-specified or agent-searched) over implicit whole-file inclusion.
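A minimal sketch of the budgeting step, assuming chunks already carry a relevance score from some ranker. The names Chunk, estimate_tokens, and select_chunks are hypothetical, and the 4-characters-per-token estimate is a rough stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    text: str
    score: float  # relevance from any ranker (assumed); higher is more relevant


def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)


def select_chunks(chunks, context_window, reserve_percent=30):
    """Greedily pack the highest-scored chunks into the non-reserved budget."""
    # Integer math keeps the budget exact: 30% reserved leaves 70% for code.
    budget = context_window * (100 - reserve_percent) // 100
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(chunk.text)
        if used + cost > budget:
            continue  # too large for the remaining room; a smaller chunk may fit
        selected.append(chunk)
        used += cost
    return selected, used, budget
```

The reserved share is never handed to retrieved code, so tool output and reasoning that arrive later in the agent loop still have room. The 30% default mirrors the figure above and is a starting point to tune, not a measured optimum.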
Journey Context:
As context windows grew from 4K to 200K+ tokens, the naive assumption was 'just include everything.' But successful products went the opposite direction. Cursor uses codebase indexing with semantic search and explicit @-mention context rather than whole-file stuffing. GitHub Copilot uses a sliding window around the cursor plus retrieved snippets. There are three reasons: (1) irrelevant context degrades model performance, because the model gets confused by noise; (2) long contexts increase latency and cost, since the compute for standard self-attention grows quadratically with sequence length; (3) tool outputs (test results, search results, error messages) need space to flow in during the agent loop. The winning pattern across products is: index offline, retrieve on demand, keep the working context lean. This is why RAG-based approaches outperformed full-context approaches even when full-context became technically possible.
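The "index offline, retrieve on demand" pattern can be sketched as below. This is an illustration only: it uses a plain keyword inverted index in place of the semantic search that real products use, and tokenize, build_index, and retrieve are hypothetical names, not any product's API.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase identifier-like terms; an embedding model would replace this step.
    return re.findall(r"[a-z0-9_]+", text.lower())


def build_index(chunks):
    """Offline step: map each term to the ids of the chunks that contain it."""
    index = defaultdict(set)
    for chunk_id, text in chunks.items():
        for term in set(tokenize(text)):
            index[term].add(chunk_id)
    return index


def retrieve(index, query, top_k=2):
    """On-demand step: rank chunks by the number of distinct query terms matched."""
    hits = Counter()
    for term in set(tokenize(query)):
        for chunk_id in index.get(term, ()):
            hits[chunk_id] += 1
    # Sort by match count (descending), then by id so ties are deterministic.
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [chunk_id for chunk_id, _ in ranked[:top_k]]
```

Only the ids returned by retrieve need to be loaded into the prompt; the rest of the codebase stays in the index until a later query asks for it.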
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:55:14.388125+00:00— report_created — created