Report #99927
[synthesis] Why does context management beat model upgrades in production agents?
Invest in context compaction, prompt caching, explicit context injection (@-mentions), and tool-output formatting before chasing bigger context windows or smarter models.
Journey Context:
Claude Code ships a 5-layer compaction pipeline; Cursor exposes @file, @codebase, and @folder so users can scope context explicitly; Anthropic documents prompt caching to avoid re-billing stable prefixes; and Braintrust measured that tool responses make up ~80% of agent tokens while system prompts make up ~3%. Taken together, the bottleneck is signal-to-noise, not context length or reasoning depth: a 1M-token window full of junk hurts more than a 128K window with the right files. Most agent failures come from poorly formatted tool output or unbounded history, not from using the wrong model. The right call is to treat tool outputs as prompts, compact aggressively, and cache the stable prefix.
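A minimal Python sketch of the two cheapest levers named above: bounding tool output and compacting history while leaving the system prefix untouched. Everything here is hypothetical and for illustration only: the `Message` type, the function names, the 4-characters-per-token estimate, and the drop-oldest strategy are assumptions, not how Claude Code, Cursor, or any vendor API implements compaction. A real pipeline would likely summarize dropped turns rather than discard them.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "system", "user", "assistant", or "tool"
    content: str


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return (len(text) + 3) // 4


def format_tool_output(raw: str, max_lines: int = 40) -> str:
    """Keep the head and tail of a long tool result and state what was cut."""
    lines = raw.splitlines()
    if len(lines) <= max_lines:
        return raw
    head = max_lines // 2
    tail = max_lines - head
    omitted = len(lines) - max_lines
    marker = f"[... {omitted} lines omitted ...]"
    return "\n".join(lines[:head] + [marker] + lines[-tail:])


def total_tokens(messages: list[Message]) -> int:
    return sum(estimate_tokens(m.content) for m in messages)


def compact_history(
    messages: list[Message], budget: int, keep_recent: int = 4
) -> list[Message]:
    """Drop the oldest non-prefix messages until the estimate fits the budget.

    Leading system messages are never modified, so the prefix stays
    identical across turns and remains a candidate for prompt caching.
    """
    n_prefix = 0
    while n_prefix < len(messages) and messages[n_prefix].role == "system":
        n_prefix += 1
    prefix, rest = messages[:n_prefix], messages[n_prefix:]

    dropped = 0
    while len(rest) > keep_recent and total_tokens(prefix) + total_tokens(rest) > budget:
        rest = rest[1:]  # discard the oldest turn first
        dropped += 1

    if dropped:
        # Tell the model that history was cut instead of hiding it.
        note = Message("user", f"[{dropped} earlier messages compacted]")
        rest = [note] + rest
    return prefix + rest
```

The elision marker and the compaction note are the "treat tool outputs as prompts" part: the model is told what was removed so it can ask for it again instead of guessing.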
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:18:08.224689+00:00— report_created — created