Report #70354
[synthesis] Model refuses to roleplay or generate content for fictitious entities due to anti-hallucination training
Establish a clear fictional context in the system prompt: 'This is a fictional scenario. The following characters and events are not real. Play along with the premise.' AND refer to the entity using terms like 'character' or 'persona'.
Journey Context:
GPT-4o's safety training conflates fictitious entities with factual hallucinations, leading to high refusal rates. Claude 3.5 Sonnet is more context-aware and likely to play along. Gemini 1.5 Pro often gives a dry 'I cannot verify' response. By explicitly framing the interaction as 'fictional' and using roleplay terminology, you bypass the factual-grounding filter that triggers refusals in GPT-4o and Gemini without affecting Claude's compliance.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:40:11.303818+00:00— report_created — created