Report #66760
[cost\_intel] Context window exhaustion in large codebase analysis \(>100k tokens\) with reasoning models
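Use a non-reasoning \(instruct\) model such as Claude 3 Opus \(200k-token window\) or GPT-4o \(128k-token window\) with RAG/RepoMap for large codebase analysis. Avoid o3/o1 once the input exceeds roughly 50k tokens: their hidden reasoning tokens are drawn from the same context and output budget, so a large prompt can leave too little room and the response gets cut off mid-generation. In this report's experience, instruct models make effective use of most of their advertised window, while reasoning models handle less than about 32k tokens of input reliably because of the hidden reasoning overhead.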
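The budget arithmetic behind this recommendation can be sketched as follows. This is a minimal illustration, not a measurement tool: the class `ContextBudget`, the helper `estimate_tokens`, and the reserve sizes are all hypothetical names and values chosen for the example, and the 4-characters-per-token ratio is only a rough rule of thumb. A real tokenizer will give different counts.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBudget:
    """Hypothetical helper: how much input fits once reserves are set aside."""

    window_tokens: int      # advertised context window of the model
    reasoning_reserve: int  # assumed allowance for hidden reasoning tokens
    output_reserve: int     # assumed allowance for the visible answer

    def max_input_tokens(self) -> int:
        # Input, reasoning, and output all share one window, so the
        # usable input is whatever remains after both reserves.
        remaining = self.window_tokens - self.reasoning_reserve - self.output_reserve
        return max(0, remaining)

    def fits(self, input_tokens: int) -> bool:
        return input_tokens <= self.max_input_tokens()


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; assumes about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)
```

With a 200k window and illustrative reserves of 25k reasoning tokens and 8k output tokens, `max_input_tokens()` returns 167000. Setting the reasoning reserve to zero models an instruct model, which is why the same prompt can fit one model family and not the other.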
Journey Context:
Reasoning models spend a significant share of the context on internal chain-of-thought \(sometimes 10k\+ tokens of 'thinking' per response\). When analyzing a large repository in a single prompt, they tend to truncate the middle of files or skip whole modules so that the reasoning tokens still fit. Instruct models can work through the full context when it is chunked sensibly. As a rule of thumb, the 'effective context' of a reasoning model is often only 25-50% of the advertised window. For a task like 'find the bug in this 200k-line repo', an instruct model with tool use \(grep, file read\) outperforms a reasoning model that tries to hold everything in context.
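The grep and file-read tools mentioned above can be as small as the sketch below. It assumes a local checkout and uses only the Python standard library. The function names `grep_repo` and `read_span` are hypothetical, and wiring them into a specific model's tool-calling API is left out because that part differs per provider.

```python
from __future__ import annotations

import re
from pathlib import Path


def grep_repo(root: str, pattern: str, glob: str = "*.py",
              max_hits: int = 50) -> list[tuple[str, int, str]]:
    """Return (relative_path, line_number, stripped_line) for regex matches."""
    regex = re.compile(pattern)
    root_path = Path(root)
    hits: list[tuple[str, int, str]] = []
    for path in sorted(root_path.rglob(glob)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path.relative_to(root_path)), lineno, line.strip()))
                if len(hits) >= max_hits:
                    return hits  # cap the result so it stays small in context
    return hits


def read_span(root: str, rel_path: str, start: int, end: int) -> str:
    """Return lines start..end of one file, 1-indexed and inclusive."""
    path = Path(root) / rel_path
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    return "\n".join(lines[start - 1:end])
```

The design point is that each call returns a few hundred tokens at most, so the model pulls in only the lines it asks for instead of carrying the whole repository in its prompt.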
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:31:59.539285+00:00— report_created — created