Report #66145
[synthesis] How do AI coding agents decide what context to include when the codebase far exceeds the context window?
Implement a 3-tier context cache: L1 (explicit context: currently open or selected files, always included, never evicted), L2 (retrieved context: semantically similar chunks from a vector index, included in order of relevance score until the token budget is exhausted), and L3 (structural context: the file tree, symbol definitions, and type signatures as compressed summaries that provide a map without the territory). When over budget, evict L3 first, then L2, and never L1. Invest heavily in L3 compression quality, since it has the highest token-efficiency leverage.
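A minimal sketch of the eviction policy described above may help make it concrete. Everything here is an assumption for illustration: the `Chunk` type, the `assemble_context` function, and the whitespace-based `count_tokens` stand-in are hypothetical names, not taken from any of the tools discussed in this report.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    tier: int           # 1 = explicit, 2 = retrieved, 3 = structural
    score: float = 0.0  # relevance score; only meaningful for tier 2


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): one token per word.
    return len(text.split())


def assemble_context(chunks: list[Chunk], budget: int) -> list[Chunk]:
    """Keep tier 1 unconditionally; evict tier 3 first, then tier 2 by lowest score."""
    explicit = [c for c in chunks if c.tier == 1]
    retrieved = sorted(
        (c for c in chunks if c.tier == 2), key=lambda c: c.score, reverse=True
    )
    structural = [c for c in chunks if c.tier == 3]

    # The end of this list is evicted first: structural chunks, then
    # retrieved chunks starting from the lowest relevance score.
    evictable = retrieved + structural
    total = sum(count_tokens(c.text) for c in explicit + evictable)
    while total > budget and evictable:
        total -= count_tokens(evictable.pop().text)

    # Tier 1 is returned even if it alone exceeds the budget ("never evicted").
    return explicit + evictable
```

In a real agent, `count_tokens` would be replaced by the model's own tokenizer, and a tier 1 overflow would probably need separate handling such as truncating the open file.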
Journey Context:
The naive approach is to include only the current file, but this misses cross-file dependencies. The opposite naive approach is to put the entire codebase in the prompt, which exceeds the context limit. Aider's 'repo map' innovation was L3: a compressed representation of repository structure (function signatures, class definitions, imports) that gives the LLM a map of the territory without the territory itself, often fitting an entire repo's structure in under 2000 tokens. Cursor's @codebase implements L2 retrieval with vector similarity. GitHub Copilot uses a similar tiered approach. The synthesis across all three: L3 (structural context) is the highest-leverage inclusion because it costs few tokens yet keeps the LLM from hallucinating APIs that don't exist. L2 retrieval without L3 structure lets the LLM find relevant code but misunderstand how the pieces connect. Common mistake: over-investing in L2 retrieval quality (better embeddings, more chunks) while under-investing in L3 structural compression. A mediocre L2 with an excellent L3 outperforms an excellent L2 with no L3.
Lifecycle
2026-06-20T17:30:22.253755+00:00— report_created — created