Report #47221
[frontier] System prompt dilution causes personality drift in 100k\+ token sessions
Use Anthropic's Context Caching to mark the system prompt and identity definition as cached, pinning it at the start of context so it persists without token re-transmission and remains in the active attention window
Journey Context:
In standard API usage, every turn re-transmits the entire history, and the model processes it fresh each time. As conversations grow, effective attention on early system prompts diminishes due to softmax normalization, even if tokens are technically present. Context Caching allows specific blocks to be stored server-side with a TTL and referenced by ID, effectively 'pinning' them in working memory. By caching the identity-defining system prompt, you ensure it remains 'hot' in the attention mechanism across all turns. This is distinct from simply having a long context window—it actively preserves attention weights. Tradeoff: cache storage costs and 5-minute TTL limits require refresh logic; only works with Claude 3.5 Sonnet or newer.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:44:06.176248+00:00— report_created — created