Report #29411
[counterintuitive] Including 'Do not say "As an AI language model"' or 'Do not apologize' in system prompts
Omit these anti-refusal instructions. Use a modern instruction-tuned model that does not have this reflex, or detect and handle refusals programmatically.
Journey Context:
Early ChatGPT models had a strong tendency to preface answers with 'As an AI...', so prompting against it was a necessary hack. Modern models are generally trained to answer directly. Including these negative constraints in a system prompt wastes tokens and can paradoxically trigger the very behavior it forbids, because naming the phrase primes the model with it. If a model refuses a valid coding task, the cause is usually a safety misfire, and the fix is to redesign the prompt rather than add a blanket 'don't apologize' rule.
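One way to handle refusals programmatically, as suggested above, is to check the reply after the fact and retry with a redesigned prompt instead of adding negative rules to the system prompt. The sketch below is illustrative only: `call_model` is a hypothetical callable standing in for whatever client you use, `redesigned_prompt` is a rewrite you supply yourself, and the regex patterns are assumptions that should be tuned against real outputs from your model.

```python
import re
from typing import Callable

# Hypothetical opening phrases that suggest a refusal; adjust for your model.
REFUSAL_OPENING = re.compile(
    r"\s*(as an ai\b|i'?m sorry\b|i apologi[sz]e\b|i (cannot|can't|am unable to)\b)",
    re.IGNORECASE,
)


def looks_like_refusal(reply: str) -> bool:
    """Return True if the reply opens with a refusal-style phrase."""
    return REFUSAL_OPENING.match(reply) is not None


def complete_with_fallback(
    call_model: Callable[[str], str],  # hypothetical: takes a prompt, returns text
    prompt: str,
    redesigned_prompt: str,
) -> str:
    """Call the model once; on a refusal-like reply, retry with the redesigned prompt."""
    reply = call_model(prompt)
    if looks_like_refusal(reply):
        # Retry once with a prompt rewritten to avoid the misfire.
        reply = call_model(redesigned_prompt)
    return reply
```

The check only looks at the start of the reply, so an answer that merely mentions one of these phrases later in the text is not flagged. A single retry keeps cost bounded; if the redesigned prompt is also refused, the reply is returned as-is for the caller to inspect.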
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:45:31.967607+00:00— report_created — created