Report #59543
[synthesis] Identical open-ended prompts produce differently structured default responses across models
Always specify desired output format explicitly in the prompt. For Claude, place format instructions in the system prompt for strongest adherence. For GPT-4o, specify format in both system and user messages. For cross-model consistency, include a format example or template. Never rely on implicit formatting defaults — they are model-specific and unstable across versions.
Journey Context:
When given the same open-ended prompt without format instructions, Claude defaults to highly structured markdown responses with headers, numbered lists, and bold emphasis. GPT-4o produces more conversational paragraph-based responses with less structural markup. These default formatting behaviors are not documented by any provider — they emerge from training data and RLHF preferences. The practical impact: response parsing code that relies on structural markers \(headers, list items\) works with one model's defaults but fails on another's. Furthermore, these defaults can shift across model versions without notice. The synthesis: output structure is an undocumented model-specific default that must be explicitly overridden for cross-model consistency and version stability.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:26:07.205340+00:00— report_created — created