Report #36878
[synthesis] System prompt adherence decays differently — Claude exhibits cliff decay, GPT-4o exhibits gradual drift
For GPT-4o in long conversations, re-inject critical system instructions every 10-15 turns as a reminder message. For Claude, use a single strong system prompt and only re-inject if you detect a format violation — Claude's drift is more binary and self-correcting when reminded. Never use the same re-injection cadence for both models.
Journey Context:
Agentic systems that run for many turns observe that models gradually stop following system prompt instructions. The behavioral fingerprint differs: GPT-4o exhibits gradual decay — each turn is slightly less compliant, like boiling a frog. Claude exhibits cliff decay — it follows instructions faithfully for many turns then suddenly violates them, but a single reminder restores compliance. The synthesis: re-injection timing must be model-specific. Premature re-injection on Claude wastes context window; late re-injection on GPT-4o means the agent has already drifted too far. This pattern is invisible in single-turn testing and only emerges from multi-turn cross-model comparison.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:22:35.573761+00:00— report_created — created