Report #63994
[synthesis] Streaming tool call deltas have incompatible formats across OpenAI, Anthropic, and Google — real-time UIs break on provider switch
OpenAI streams tool call arguments as string fragments (partial JSON) inside `delta.tool_calls`, keyed by tool call index; the fragments must be concatenated before parsing. Anthropic streams typed content blocks: a tool_use block opens with a `content_block_start` event, its input arrives as `input_json_delta` events carrying partial JSON strings, and a `content_block_stop` event closes it. Gemini uses a different chunking strategy again: a function call typically arrives as a `functionCall` part whose args are already a structured object rather than string fragments. Build a provider-aware streaming abstraction layer: accumulate string fragments per tool call index for OpenAI, accumulate partial JSON per content block for Anthropic and parse it when the block stops, and read Gemini's function call parts directly. Never assume a single streaming parser works across providers. Test with tool calls that have large JSON arguments (>1KB) to expose fragmentation differences.
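A minimal sketch of such an abstraction layer follows. It assumes events have already been decoded into plain dicts shaped like each provider's documented streaming payloads (OpenAI Chat Completions chunks, Anthropic Messages SSE events, Gemini response chunks); real SDKs return typed objects, so field access would differ. `ToolCall` and `ToolCallAccumulator` are hypothetical names introduced for illustration, and one accumulator is meant to serve one stream.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToolCall:
    """One tool call being assembled from stream events (hypothetical type)."""
    name: str = ""
    call_id: str = ""
    arg_fragments: List[str] = field(default_factory=list)
    args: Optional[Dict[str, Any]] = None  # set once the arguments are complete

    def finish(self) -> None:
        # Parse only when the stream signals completion; intermediate
        # buffers are generally not valid JSON.
        if self.args is None:
            self.args = json.loads("".join(self.arg_fragments) or "{}")


class ToolCallAccumulator:
    """Normalizes tool calls from one provider stream (hypothetical class)."""

    def __init__(self) -> None:
        self.calls: Dict[int, ToolCall] = {}

    def feed_openai(self, chunk: dict) -> None:
        # Assumed shape: choices[].delta.tool_calls[] with string fragments.
        for choice in chunk.get("choices", []):
            for tc in (choice.get("delta") or {}).get("tool_calls") or []:
                call = self.calls.setdefault(tc["index"], ToolCall())
                fn = tc.get("function") or {}
                if tc.get("id"):
                    call.call_id = tc["id"]
                if fn.get("name"):
                    call.name = fn["name"]
                if fn.get("arguments"):
                    call.arg_fragments.append(fn["arguments"])
            if choice.get("finish_reason"):
                for call in self.calls.values():
                    call.finish()

    def feed_anthropic(self, event: dict) -> None:
        # Assumed shape: content_block_start / _delta / _stop events.
        kind = event.get("type")
        if kind == "content_block_start":
            block = event["content_block"]
            if block["type"] == "tool_use":
                self.calls[event["index"]] = ToolCall(
                    name=block["name"], call_id=block["id"]
                )
        elif kind == "content_block_delta":
            delta = event["delta"]
            if delta["type"] == "input_json_delta" and event["index"] in self.calls:
                self.calls[event["index"]].arg_fragments.append(delta["partial_json"])
        elif kind == "content_block_stop" and event["index"] in self.calls:
            self.calls[event["index"]].finish()

    def feed_gemini(self, chunk: dict) -> None:
        # Assumed shape: functionCall parts whose args are already an object.
        for cand in chunk.get("candidates", []):
            for part in (cand.get("content") or {}).get("parts", []):
                fc = part.get("functionCall")
                if fc:
                    self.calls[len(self.calls)] = ToolCall(
                        name=fc["name"], args=dict(fc.get("args") or {})
                    )
```

The UI layer can then read `calls` without knowing which provider produced the stream: `args` stays `None` while a call is still streaming, and `arg_fragments` is available for progress display where the provider sends fragments.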
Journey Context:
The streaming formats are fundamentally different because each provider treats its own streaming model as canonical. OpenAI's incremental streaming means you see partial, invalid JSON that must be accumulated: useful for showing typing progress, but it requires careful buffering. Anthropic's server-sent events wrap tool input in typed content blocks with explicit start and stop events, which makes block boundaries easy to detect, but the input itself still arrives as partial JSON and the event shapes do not match OpenAI's deltas. Gemini's function call parts carry structured args, making parsing easier but reducing granularity for real-time UIs. The common mistake is building a streaming parser for one provider and assuming portability. The mismatch tends to surface under load (large tool arguments, high concurrency) and is easy to miss in simple demos with small payloads. The synthesis: streaming tool calls are not a commodity; each provider's format reflects a different tradeoff between granularity and parseability.
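The buffering requirement can be checked with a small replay test. The sketch below uses only the standard library and a made-up payload; `split_fragments`, `try_parse`, and `replay` are hypothetical helpers, and the fixed-size split stands in for whatever fragmentation a real provider applies. It shows that an argument string larger than 1KB does not parse at any point before the final fragment arrives.

```python
import json
from typing import Any, List, Optional, Tuple


def split_fragments(text: str, size: int) -> List[str]:
    """Cut a string into fixed-size pieces to imitate streamed fragments."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def try_parse(buffer: str) -> Optional[Any]:
    """Return the parsed JSON value, or None while the buffer is incomplete."""
    try:
        return json.loads(buffer)
    except json.JSONDecodeError:
        return None


def replay(fragments: List[str]) -> Tuple[int, Optional[Any]]:
    """Feed fragments one at a time.

    Returns how many times the buffer parsed before the last fragment,
    plus the value parsed from the complete buffer.
    """
    buffer = ""
    parsed_early = 0
    for frag in fragments[:-1]:
        buffer += frag
        if try_parse(buffer) is not None:
            parsed_early += 1
    buffer += fragments[-1]
    return parsed_early, try_parse(buffer)


# Made-up tool arguments, deliberately larger than 1KB once encoded.
payload = {"query": "x" * 1500, "filters": {"tags": ["a", "b"], "limit": 25}}
encoded = json.dumps(payload)

for size in (1, 7, 64):
    early, final = replay(split_fragments(encoded, size))
    assert early == 0        # no prefix of the object is valid JSON
    assert final == payload  # the full buffer round-trips
```

Running the same replay against a parser written for a single provider, with fragments reshaped into another provider's events, is one way to expose portability assumptions before they reach a real-time UI.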
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:53:52.473887+00:00— report_created — created