Report #24960
[gotcha] Model emits text tokens, then decides to call a tool: streaming UI breaks on responses that mix content and tool_calls
When streaming, if tool_calls appear in the response, treat any text already streamed in that turn as transitional reasoning, not as the final answer. Design your UI to handle three distinct states per turn: text-only answer, tool-call-only, or text-preamble-plus-tool-call. Never render streamed text as the definitive answer while a tool call is still pending. If the text is short ('Let me look that up...'), show it as a transitional status message. If it is substantial, show it as collapsible reasoning.
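The three states and the short-versus-substantial rule can be sketched as below. This is a minimal illustration, not a prescribed implementation: the names TurnKind, classify_turn, and preamble_display are hypothetical, and the 80-character cutoff is an assumed value to tune for your own UI.

```python
from enum import Enum


class TurnKind(Enum):
    TEXT_ONLY = "text_only"
    TOOL_CALL_ONLY = "tool_call_only"
    PREAMBLE_PLUS_TOOL_CALL = "preamble_plus_tool_call"


# Assumed cutoff (not from the report): preambles at or below this
# length are shown as a status line, longer ones as collapsible reasoning.
SHORT_PREAMBLE_CHARS = 80


def classify_turn(text: str, tool_calls: list) -> TurnKind:
    """Classify a fully received turn into one of the three states."""
    has_text = bool(text.strip())
    if tool_calls:
        if has_text:
            return TurnKind.PREAMBLE_PLUS_TOOL_CALL
        return TurnKind.TOOL_CALL_ONLY
    return TurnKind.TEXT_ONLY


def preamble_display(text: str) -> str:
    """Pick a presentation for text that preceded a tool call."""
    if len(text.strip()) <= SHORT_PREAMBLE_CHARS:
        return "status"
    return "collapsible"
```

A renderer would call classify_turn once the turn is complete and only consult preamble_display in the PREAMBLE_PLUS_TOOL_CALL case.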
Journey Context:
When using function calling with streaming enabled, the model can emit text tokens before emitting a tool_call. This happens because the model is 'thinking out loud' before deciding to invoke a tool. The gotcha: your UI has already started rendering those text tokens as the response, and now it has to transition to a tool-call state. Do you keep the text? Remove it? Show it as reasoning? Most implementations either render the text as the answer and then, confusingly, also call a tool, or have no plan for this state at all and produce a broken UI. The correct approach depends on your UX model, but the key insight is that you must design for this transition before you meet it in production, because it will happen. The model does not guarantee tool-call-only responses when tools are available: responses that contain both content and a tool_call are valid and common.
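One way to plan for the transition is to keep streamed text provisional until the stream ends, and to relabel it the moment a tool call shows up. The sketch below assumes a simplified, hypothetical delta shape: a dict with an optional "content" string and an optional "tool_calls" list of whole tool-call objects. Real streaming APIs typically deliver tool calls as partial fragments that need merging, so treat StreamingTurn as an illustration of the state handling only.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingTurn:
    """Accumulates one streamed turn and reports how its text should be shown."""

    text: str = ""
    tool_calls: list = field(default_factory=list)
    finished: bool = False

    def feed(self, delta: dict) -> None:
        # Hypothetical delta shape: {"content": str} and/or {"tool_calls": [...]}.
        if delta.get("content"):
            self.text += delta["content"]
        if delta.get("tool_calls"):
            self.tool_calls.extend(delta["tool_calls"])

    def finish(self) -> None:
        self.finished = True

    @property
    def text_role(self) -> str:
        # Once any tool call has arrived, the text is a preamble, never the answer.
        if self.tool_calls:
            return "preamble"
        # Without a tool call, the text only becomes the answer when the stream ends.
        return "answer" if self.finished else "provisional"
```

The UI can render "provisional" text in a neutral style while tokens arrive, then either promote it to "answer" on finish or demote it to a status line or collapsible block when text_role flips to "preamble".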
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:18:22.334564+00:00— report_created — created