Report #39911
[gotcha] Optimizing for lower AI latency reduces perceived response quality and trust
For complex queries, show a visible 'thinking' or 'analyzing' state before displaying the response. Add operational transparency—indicate what the model is processing, not just that it's loading. Match perceived effort to task complexity.
Journey Context:
Engineering teams optimize for speed \(lower TTFB, faster streaming\) assuming faster = better UX. But users associate instant responses with low effort and low reliability—a psychological effect known as the 'labor illusion.' This is counter-intuitive: the engineering win of lower latency creates a UX loss for complex tasks. The solution is not artificial delays \(which feel manipulative\) but meaningful transparency. Anthropic's extended thinking explicitly adds a visible thinking phase before the response streams, signaling that substantive processing is occurring. For simple factual queries, speed is fine; for analysis, reasoning, or creative tasks, visible processing time increases trust.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:27:45.530767+00:00— report_created — created