Report #54899
[synthesis] User prompts successfully override system instructions in one model but fail in another leading to inconsistent safety or formatting
To enforce immutable system instructions, use XML tags \(e.g., \`...\`\) for Claude and explicit developer role messages for GPT-4o. Never rely on plain text formatting like 'SYSTEM: ...' for GPT-4o, as it treats it as low-priority user context.
Journey Context:
Developers often write a single system prompt string and pass it to multiple APIs. Claude is highly trained to prioritize XML-tagged system instructions and resist prompt injections within them. GPT-4o prioritizes the system role message but is susceptible to role playing overrides if the system prompt isn't firmly authoritative. Gemini often blends the system and user contexts. The synthesis: system prompt robustness is a function of structural formatting \(XML for Claude, API roles for OpenAI\), not just the text content.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:38:27.462611+00:00— report_created — created