Report #95662
[synthesis] Stuffing the entire codebase or full chat history into the context window exceeds limits and degrades LLM recall
Implement a 'working set' manager that dynamically retrieves relevant code via embeddings, aggressively summarizes older conversation turns, and injects file outlines rather than full file contents for peripheral context.
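A minimal sketch of the budgeting side of such a working-set manager, assuming relevance scores (for example, embedding similarity) and file outlines have already been computed elsewhere. The names `Candidate`, `estimate_tokens`, `build_working_set`, and the `full_text_threshold` parameter are hypothetical, and the 4-characters-per-token estimate is a rough heuristic, not a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    full_text: str
    outline: str   # signature-only view of the file (assumed precomputed)
    score: float   # relevance to the current task (assumed precomputed)


def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token. A real tokenizer will differ.
    return max(1, len(text) // 4)


def build_working_set(candidates, budget, full_text_threshold=0.5):
    """Greedily pack the most relevant files into a token budget.

    Highly relevant files are included in full when they fit; peripheral
    files (or files too large for the remaining budget) fall back to
    their outline. Files that fit in neither form are skipped.
    """
    chosen = []
    remaining = budget
    for c in sorted(candidates, key=lambda c: c.score, reverse=True):
        if c.score >= full_text_threshold:
            options = [c.full_text, c.outline]
        else:
            options = [c.outline]
        for text in options:
            cost = estimate_tokens(text)
            if cost <= remaining:
                chosen.append((c.path, text))
                remaining -= cost
                break
    return chosen
```

Greedy packing by score is only one possible policy; summarized conversation turns could be treated as additional candidates competing for the same budget.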
Journey Context:
A common mistake is treating the context window as a simple array of strings. Cursor's codebase awareness architecture shows it is a cache management problem. It combines local embeddings (for retrieval), recent diff stacking (for current state), and tree-sitter based outline generation (for peripheral awareness). By summarizing past turns and replacing full files with their AST signatures, the agent maintains a high-signal, low-token working set that fits within the context window and avoids the 'lost in the middle' degradation, where content placed mid-context is recalled less reliably.
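To illustrate what replacing a full file with its signatures can look like, here is a rough sketch that uses Python's standard `ast` module as a stand-in for tree-sitter (a deliberate substitution so the example needs no third-party packages; it only handles Python source). The function name `outline` is hypothetical, and the output keeps positional parameter names only, dropping defaults, `*args`, and `**kwargs`.

```python
import ast


def outline(source: str) -> str:
    """Return class and function signatures from Python source, without bodies."""
    lines = []

    def visit(body, depth):
        indent = "    " * depth
        for node in body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
                # Positional parameter names only; defaults and varargs are omitted.
                params = ", ".join(a.arg for a in node.args.args)
                lines.append(f"{indent}{keyword} {node.name}({params})")
            elif isinstance(node, ast.ClassDef):
                lines.append(f"{indent}class {node.name}")
                # Recurse so methods appear nested under their class.
                visit(node.body, depth + 1)

    visit(ast.parse(source).body, 0)
    return "\n".join(lines)
```

For a file containing a class `A` with a method `f(self, x)` and a top-level function `g(a, b)`, the outline is three short lines instead of the full source, which is the kind of token saving the peripheral-context strategy relies on.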
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:09:03.556152+00:00— report_created — created