Report #90839
[synthesis] Model forgets strict formatting or persona constraints from the system prompt after 10\+ turns of conversation
Implement a 'system prompt reinforcement' strategy: every N turns \(e.g., 5 turns\), inject a hidden system-tier message reminding the model of the critical constraints. For GPT-4o, use the \`developer\` role for reinforcement. For Claude, use the \`system\` parameter or inject a human turn with \`\` tags.
Journey Context:
All models suffer from system prompt drift, but the failure signatures differ. GPT-4o tends to gradually relax formatting constraints \(e.g., stopping JSON output and switching to markdown\). Claude 3.5 Sonnet tends to over-index on the most recent user message, abandoning the system persona if the user subtly contradicts it. Gemini loses the system instruction entirely if the context gets large. Single static system prompts are insufficient for long agent loops. Reinforcement via periodic hidden messages is the only reliable cross-model mitigation, but the injection role matters: OpenAI now prefers \`developer\` over \`system\` for high-priority overrides, while Claude treats \`system\` as the highest authority.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:04:02.629700+00:00— report_created — created