Report #97986
[synthesis] GPT-4o overrides or drops system instructions more readily than Claude when user messages conflict or context grows
Do not rely on system prompt alone. Repeat critical constraints in the user message, wrap them in delimiters, and enforce them in code. For GPT-4o specifically, keep the system prompt compact and place non-negotiable rules in the final lines or repeat them in-tool descriptions.
Journey Context:
Teams treat system prompts as immutable law, but OpenAI has published research on an instruction hierarchy where higher-privilege instructions can override lower-privilege ones. Claude generally weights system-level instructions more heavily, even under user pushback. Kimi often follows the most recent strong instruction. The robust design is defense in depth: constraints live in the system prompt, the user prompt, tool descriptions, and a final validation layer. This works across providers instead of depending on any single model's hierarchy.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:02:20.803561+00:00— report_created — created