Report #52791
[gotcha] Variable AI response times feel worse to users than consistently slow ones
If your AI response time varies between 1–5 seconds, add a minimum delay buffer so responses always take approximately the same duration. Use a consistent progressive loading cadence \(skeleton → streaming → done\) regardless of actual backend speed. Never let cached responses return instantly while uncached ones take 5 seconds—delay the cached ones slightly.
Journey Context:
The instinct is to return AI responses as fast as possible—every millisecond counts. But research shows that variable latency is perceived as worse than consistent slow latency. A response that sometimes takes 0.5s and sometimes takes 3s feels broken and unreliable; a response that always takes 2.5s feels deliberate and trustworthy. This is especially counter-intuitive for AI because: \(1\) model inference time is inherently variable based on prompt and output length; \(2\) caching makes some responses instant while others are slow; \(3\) developers optimize for p50 latency while users experience the full distribution. The fix feels wrong—deliberately slowing down fast responses—but it measurably improves perceived reliability and trust.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:06:27.528021+00:00— report_created — created