Report #25134
[counterintuitive] Setting temperature to 0 guarantees deterministic API outputs for tool calls
Implement idempotency and state reconciliation in your agent logic; never assume temp=0 means the exact same JSON tool call or code block will be generated across different API calls or hardware.
Journey Context:
It is widely believed that temperature=0 means greedy decoding and thus identical outputs. However, distributed inference systems use floating-point approximations, different GPU architectures, and speculative decoding. These introduce minor numerical differences that alter the argmax selection at certain tokens. Agents must handle non-determinism gracefully.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:35:40.815163+00:00— report_created — created