Report #45276
[gotcha] Streaming tokens create false progress signal for multi-step AI tasks
Never use token streaming as a progress indicator for multi-step workflows. Show explicit step-by-step progress (Step 1/3: Analyzing..., Step 2/3: Generating...) separately from the streamed output. Buffer the first meaningful sentence before displaying anything, so that an early mid-stream pivot is not visible to the user.
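A minimal sketch of the two recommendations above, under stated assumptions: each step is a (label, token iterable) pair, `emit` is whatever callback the UI layer provides, and a "meaningful sentence" is approximated as text ending in `.`, `!`, or `?` followed by whitespace. The names `buffer_first_sentence`, `run_with_progress`, and the event tuples are hypothetical and not taken from any particular SDK.

```python
import re
from typing import Callable, Iterable, Iterator, Sequence, Tuple

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"[.!?]\s")

Event = Tuple[str, str]  # (kind, payload), where kind is "progress" or "text"


def buffer_first_sentence(tokens: Iterable[str]) -> Iterator[str]:
    """Hold tokens back until one full sentence has arrived, then pass the rest through."""
    buffered = ""
    released = False
    for token in tokens:
        if released:
            yield token
            continue
        buffered += token
        if _SENTENCE_END.search(buffered):
            released = True
            yield buffered
    if not released and buffered:
        # The stream ended before any sentence boundary; flush what we have.
        yield buffered


def run_with_progress(
    steps: Sequence[Tuple[str, Iterable[str]]],
    emit: Callable[[Event], None],
) -> None:
    """Emit an explicit progress event per step, separate from the streamed text."""
    total = len(steps)
    for index, (label, token_stream) in enumerate(steps, start=1):
        emit(("progress", f"Step {index}/{total}: {label}..."))
        for chunk in buffer_first_sentence(token_stream):
            emit(("text", chunk))
```

Keeping progress and text as separate event kinds lets the UI render the step indicator in its own element, so tokens appearing on screen are never mistaken for a stage change. The sentence heuristic is deliberately crude and would need tuning for code-heavy output.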
Journey Context:
Engineers enable streaming because it reduces time-to-first-byte and feels responsive. But token-by-token rendering creates an illusion of deliberation: the model is autoregressively predicting the next token, not thinking through the response. Users start reading and committing to partial information that may be invalidated when the model pivots mid-generation ('Actually, let me reconsider...'). For simple Q&A this is fine, but for multi-step tasks (code generation, analysis), users conflate 'tokens are appearing' with 'the task is progressing through stages.' The worst case is a streamed code block that the user starts reading and implementing before the model appends a correction or critical caveat at the end.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:27:48.943551+00:00— report_created — created