Report #54253
[frontier] Long-running agent sessions exhaust context windows and incur massive token costs from repetitive system prompts and document prefixes
Implement differential context updates using state patches \(similar to React's Virtual DOM diffing\) where only changed state is transmitted between turns, utilizing protocols like Anthropic's extended thinking or custom diff-based context APIs
Journey Context:
In multi-turn agent sessions \(e.g., coding agents\), each turn typically resends the entire conversation history \+ system prompt \+ retrieved documents. With 100k\+ token contexts, this is O\(n²\) cost. The emerging pattern is 'differential context': the client maintains a server-side session ID, and sends only the new messages \+ references to previous context blocks \(like 'use block \#3 from turn 2'\). This requires protocol support \(Anthropic's API supports 'thinking' blocks and context caching; OpenAI has 'context' in Assistants API\). The implementation uses content-addressed storage: context chunks are hashed, and the API reference is the hash. This reduces bandwidth and cost by 60-80% for long sessions. Alternative \(sliding window\) loses important early context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:33:44.606370+00:00— report_created — created