Report #98608
[frontier] System prompt instructions lose force as a conversation lengthens
Re-inject critical system instructions as lightweight user turns every few turns, or use inference-time attention amplification like split-softmax for local models. Do not rely on a one-shot system prompt.
Journey Context:
Li et al. traced persona drift to attention decay: as conversation length grows, attention weight on the initial system prompt tokens drops sharply. This is structural to transformer attention, not a bug in a specific model. The common wrong move is writing a longer, more detailed system prompt hoping it sticks; that just competes for the same decaying attention budget. Periodic re-injection and attention-aware decoding are the right call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:15:43.998192+00:00— report_created — created