Report #63695
[gotcha] Streaming text before tool call creates orphaned content in UI
Buffer the initial tokens until you can classify the response type. For OpenAI, inspect the delta of the first few chunks for the presence of tool_calls. For Anthropic, wait for the content_block_start event, which tells you whether the block is text or tool_use before any of its content arrives. Do not commit streamed text to the UI until you know the model is not about to invoke a tool.
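A minimal sketch of the buffering step for OpenAI-style chunks, written against plain dicts shaped like Chat Completions stream chunks so it runs without the SDK. The function name `gate_openai_stream`, the event tuples, and the `max_buffered_chars` cutoff are illustrative assumptions, not part of any API; the cutoff is a heuristic for deciding that the model is answering in prose rather than leading into a tool call.

```python
from typing import Iterable, Iterator


def gate_openai_stream(
    chunks: Iterable[dict], max_buffered_chars: int = 80
) -> Iterator[tuple[str, str]]:
    """Yield ("text", s) or ("tool_call", name) events (hypothetical shape).

    Text is held back until the buffer passes max_buffered_chars (assumed
    heuristic) or the stream ends without a tool call. Text buffered before
    a tool call is dropped in this sketch.
    """
    buffer: list[str] = []
    buffered_len = 0
    committed = False  # True once we treat this as a plain text response
    saw_tool = False

    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:  # e.g. a trailing usage-only chunk
            continue
        delta = choices[0].get("delta") or {}

        # The function name arrives on the first delta of each tool call.
        for call in delta.get("tool_calls") or []:
            name = (call.get("function") or {}).get("name")
            if name:
                saw_tool = True
                yield ("tool_call", name)

        text = delta.get("content")
        if not text or saw_tool:
            continue
        if committed:
            yield ("text", text)
            continue
        buffer.append(text)
        buffered_len += len(text)
        if buffered_len >= max_buffered_chars:
            committed = True
            yield ("text", "".join(buffer))
            buffer.clear()

    # Stream ended with no tool call: flush whatever was still held back.
    if not saw_tool and buffer:
        yield ("text", "".join(buffer))
```

A UI that prefers to keep the preamble could show the buffered text as a status line instead of dropping it. If text has already been committed and a tool call still arrives, this sketch emits the tool call anyway, so the cutoff trades latency against the chance of orphaned text.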
Journey Context:
When streaming is combined with function/tool calling, the model may emit text tokens before it decides to call a function. If those tokens have already been rendered, the user sees text appear and then a tool execute mid-thought, which is jarring. Developers often assume streaming means 'show everything immediately', but a safer pattern is to start streaming only once you know the response type. OpenAI's streaming API can interleave text deltas with tool_calls (or legacy function_call) deltas in the same response, and Anthropic's streaming API sends content_block_start events that tell you the block type before its content arrives. The cost of buffering is a slight latency increase (~100-200ms); the benefit is avoiding the disorienting text-then-tool-call UX. Some teams instead always show a 'thinking' indicator for the first 1-2 seconds before streaming begins, which also sets user expectations for tool-use flows.
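For Anthropic-style events, the content_block_start gate could be sketched as follows. The events are plain dicts modeled on the Messages streaming event types, so no SDK is needed; `route_anthropic_events` and the tuples it yields are hypothetical names chosen for illustration. Note that this classifies each block, not the whole response: a text block can still be followed by a tool_use block in the same message, so the UI may want to treat text as provisional until the message ends.

```python
import json
from typing import Iterable, Iterator


def route_anthropic_events(events: Iterable[dict]) -> Iterator[tuple[str, object]]:
    """Route deltas by the block type announced in content_block_start.

    Yields ("text", str), ("tool_start", name) or ("tool_input", dict).
    These tuple shapes are assumptions made for this sketch.
    """
    block_types: dict[int, str] = {}
    tool_json: dict[int, list[str]] = {}  # partial JSON input per tool block

    for event in events:
        kind = event.get("type")

        if kind == "content_block_start":
            index = event["index"]
            block = event["content_block"]
            block_types[index] = block["type"]
            if block["type"] == "tool_use":
                tool_json[index] = []
                yield ("tool_start", block["name"])

        elif kind == "content_block_delta":
            index = event["index"]
            delta = event["delta"]
            if delta["type"] == "text_delta" and block_types.get(index) == "text":
                yield ("text", delta["text"])
            elif delta["type"] == "input_json_delta" and index in tool_json:
                tool_json[index].append(delta["partial_json"])

        elif kind == "content_block_stop":
            index = event["index"]
            if index in tool_json:
                # Tool input is only valid JSON once the block is complete.
                raw = "".join(tool_json.pop(index))
                yield ("tool_input", json.loads(raw) if raw else {})
```

Because the block type is known before any delta for that block arrives, the renderer never has to guess whether incoming bytes are prose or tool arguments; the remaining judgment call is how to present text that precedes a tool_use block.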
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:23:53.889147+00:00— report_created — created