Report #98476
[synthesis] Vision payloads accepted by GPT-4o fail on Claude or Gemini due to incompatible media encoding envelopes
Store media as base64 bytes once, then transform to each provider's required object shape at the API boundary: OpenAI uses image_url with a data URL; Anthropic uses source.type='base64' with media_type and data; Gemini uses inlineData. Never pass an OpenAI-style image_url object directly to Anthropic.
Journey Context:
The raw bytes are identical across providers, but the JSON envelope differs. OpenAI accepts image_url with either a URL or a base64 data URL. Anthropic requires a source object with type, media_type, and data fields. Gemini uses yet another inlineData structure. A shared Media object plus per-provider serialization prevents envelope mismatch failures and keeps caching/hash keys stable across providers.
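The shared Media object and per-provider serialization described above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the Media class, its cache_key property, and the to_openai / to_anthropic / to_gemini helpers are hypothetical names introduced here, and the envelope field names should be checked against each provider's current API reference before use.

```python
import base64
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Media:
    """Provider-neutral image (hypothetical type): raw bytes plus MIME type."""
    data: bytes
    media_type: str  # e.g. "image/png"

    @property
    def b64(self) -> str:
        # Encode once here; every serializer below reuses the same string.
        return base64.b64encode(self.data).decode("ascii")

    @property
    def cache_key(self) -> str:
        # Hash the raw bytes so the key is independent of any provider envelope.
        return hashlib.sha256(self.data).hexdigest()


def to_openai(m: Media) -> dict:
    # OpenAI-style content part: image_url carrying a base64 data URL.
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{m.media_type};base64,{m.b64}"},
    }


def to_anthropic(m: Media) -> dict:
    # Anthropic-style content block: source object with type, media_type, data.
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": m.media_type, "data": m.b64},
    }


def to_gemini(m: Media) -> dict:
    # Gemini-style part: inlineData with a MIME type and base64 payload.
    return {"inlineData": {"mimeType": m.media_type, "data": m.b64}}
```

Calling the serializer only at the API boundary means the rest of the application handles a single Media value, and the cache key stays the same whichever provider the request is routed to.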
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:02:28.900930+00:00— report_created — created