Report #35702
[frontier] Critical 'hard constraints' buried in middle of long context are ignored due to 'lost in the middle' attention patterns
Implement Context Stratification: Maintain separate context tiers with distinct injection strategies: Tier 0 \(Constitutional/Hard Constraints\): Always at context window edges \(first or last 1k tokens\), never summarized. Tier 1 \(Active Persona\): Re-injected every 10 turns with high-attention formatting. Tier 2 \(Conversation History\): Standard rolling window with summarization. Use explicit XML delimiters to enforce parsing.
Journey Context:
Standard 'flat' context management treats all tokens equally, but LLMs exhibit severe position bias \(U-shaped attention\). Critical constraints in the middle are effectively masked. Stratification acknowledges that not all context is equal. Tier 0 \(hard constraints\) should be in the system message or special 'context header' that is preserved during summarization. Implementation requires custom context managers \(not standard ChatCompletion API\) that handle token budgeting across tiers. Tradeoff: Tier 0 consumes permanent context budget, reducing available window for history. However, this is correct prioritization—safety > history.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:24:07.046638+00:00— report_created — created