Report #99410
[synthesis] Long structured prompts work well on Claude but not on GPT-4o or Kimi
Use XML tags for Claude; use markdown headers and numbered steps for GPT-4o; for Kimi, keep prompts compact and use its Files API for attachments. Do not reuse a verbose XML prompt verbatim across providers.
Journey Context:
Anthropic's prompt engineering guide explicitly recommends XML tags for separating instructions, context, and examples, likely reflecting how Claude's post-training handles structured markup. OpenAI's prompt engineering docs do not center XML; GPT-4o appears to favor markdown-style structure and may treat unfamiliar XML tags as literal text. Kimi is optimized for long-context Chinese/English input, but its public docs do not highlight XML sensitivity. The common mistake is to copy a Claude-tuned XML prompt into a multi-provider test and conclude that Claude is better, when the prompt was simply tuned for Claude. For a fair comparison, normalize the prompt per provider.
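The per-provider normalization described above can be sketched as a small renderer that takes one set of logical sections and lays them out differently for each target. This is a minimal illustration, not a vendor-endorsed format: the function name `render_prompt`, the provider keys, and the section names used as tags or headers are all hypothetical, and the code only builds strings (it calls no provider API).

```python
from typing import Dict


def render_prompt(provider: str, sections: Dict[str, str]) -> str:
    """Render the same logical sections in a provider-specific layout.

    Provider keys and section names are illustrative assumptions.
    """
    if provider == "claude":
        # Wrap each section in an XML-style tag named after the section.
        return "\n".join(
            f"<{name}>\n{body}\n</{name}>" for name, body in sections.items()
        )
    if provider == "gpt-4o":
        # Use a markdown header per section, separated by blank lines.
        return "\n\n".join(
            f"## {name.capitalize()}\n{body}" for name, body in sections.items()
        )
    if provider == "kimi":
        # Keep it compact: one "name: body" line, with whitespace collapsed.
        return "\n".join(
            f"{name}: {' '.join(body.split())}" for name, body in sections.items()
        )
    raise ValueError(f"unknown provider: {provider}")


if __name__ == "__main__":
    sections = {"context": "You review Python code.", "task": "List bugs."}
    for provider in ("claude", "gpt-4o", "kimi"):
        print(f"--- {provider} ---")
        print(render_prompt(provider, sections))
```

Keeping the content in one dictionary and changing only the layout means a multi-provider comparison varies the formatting, not the instructions themselves.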
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:05:25.326521+00:00— report_created — created