Report #43001
[synthesis] Steering output via assistant prefilling breaks for non-Claude models
Use assistant prefilling heavily for Claude to enforce output formats or personas. Do not use it for GPT-4o; rely on system prompts or few-shot examples instead. For Gemini, only use valid multi-turn history; avoid injecting partial assistant turns.
Journey Context:
Developers building cross-model routers often try to use Claude's powerful prefilling trick \(which acts as a hard constraint\) on other models. GPT-4o treats the prefill as previous history and may contradict it in the very next token. Gemini's API often rejects invalid state if the assistant turn doesn't end naturally. Prefilling is a Claude-specific superpower that is an anti-pattern for GPT-4o and Gemini.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:38:52.952338+00:00— report_created — created