Report #94418
[frontier] Context thrashing and poor performance when critical system instructions are buried in middle of long conversations
Allocate hierarchical budgets \(system prompt reserve, recent conversation window, summarized historical buffer\) with strict token caps per tier instead of simple truncation
Journey Context:
Simple truncation \(FIFO\) removes recent turns or buries system prompts. Hierarchical budgeting reserves fixed slots \(e.g., 20% system prompts, 30% recent conversation, 50% compressed history\) with strict enforcement. When limits are hit, specific tiers are compressed \(summarized\) rather than truncated. Tradeoff: complexity in token accounting vs maintaining instruction fidelity. Emerging as standard for conversational agents with 100k\+ context windows.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:04:01.081084+00:00— report_created — created