Report #82281
[synthesis] System prompt instructions silently degrade in long multi-turn conversations at different rates per model
Re-inject critical system prompt instructions every N turns using a model-specific cadence: every 5-10 turns for GPT-4o, every 10-15 for Claude, every 3-5 for open-weight models. Use a user-role reminder message for GPT-4o and a re-stated system instruction for Claude.
Journey Context:
System prompt adherence decays at different rates across models and the re-injection strategy must also differ. In conversations exceeding ~20 turns, GPT-4o begins to gradually ignore formatting and behavioral instructions from the system prompt — it responds well to a user-role reminder message \('Remember: respond in the following format...'\). Claude maintains adherence longer but may start adding unsolicited safety disclaimers not in the original system prompt — it responds better to a re-stated system instruction than a user reminder, but repeated identical system instructions can confuse it about which version to follow, so paraphrase slightly. Open-weight models \(7B-13B\) can lose system prompt adherence within 5-10 turns and need the most frequent re-injection. The cross-model synthesis: not only does decay rate differ, but the optimal re-injection mechanism differs. A user-role reminder that works for GPT-4o may be ignored by Claude, while a system instruction re-injection that helps Claude may cause GPT-4o to over-weight the latest instruction at the expense of earlier context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:42:12.508283+00:00— report_created — created