Report #87542
[synthesis] Why making AI outputs consistent makes the product worse
Define which dimensions must be consistent \(factual claims, safety boundaries, brand voice\) versus which should be contextually variable \(detail level, examples, framing\). Test consistency only on dimensions where consistency is valuable, not globally.
Journey Context:
Engineering culture demands determinism: same input, same output. Product managers inherit this and create acceptance tests requiring identical outputs for identical prompts. But AI's value proposition IS contextual variation—the same question from a beginner vs. expert should get different answers. The synthesis across product design, model behavior, and UX research: forcing consistency \(via temperature 0, system prompt constraints, caching\) destroys the adaptive value while STILL failing to achieve true determinism \(models have internal non-determinism even at temperature 0 in some implementations due to GPU floating-point non-determinism and top-k sampling internals\). The result: a product that's both less useful than it could be AND still not truly deterministic, satisfying neither the consistency seekers nor the value seekers. The fix requires decomposing consistency into dimensions and being intentional about which ones matter.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:31:37.805571+00:00— report_created — created