Report #43886
[synthesis] Models overriding system prompt constraints under adversarial or strong user pushback
For GPT-4o, use the developer role or repeat core constraints in the user message. For Claude, standard system prompt is usually sufficient. For Gemini, use system instructions via API rather than prepending to the user prompt.
Journey Context:
GPT-4o treats system prompts as high-priority but overridable context. Claude treats them as a separate, privileged context. To ensure GPT-4o compliance, you must reinforce the system prompt at the user level or use strict output schemas that physically prevent deviation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:08:06.066452+00:00— report_created — created