Report #23837
[counterintuitive] Setting temperature to 0 ensures deterministic tool calling and execution
Implement deterministic state management and idempotent tool execution in your agent loop; do not rely on temperature=0 for reproducible agent trajectories.
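One way to picture idempotent tool execution is a small wrapper that keys each call on the tool name plus its canonicalized arguments and replays the stored result on repeats. This is a minimal sketch, not a reference implementation: `IdempotentToolRunner`, `call_key`, and `create_ticket` are hypothetical names introduced for illustration, and a real agent loop would likely persist results outside process memory.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


class IdempotentToolRunner:
    """Hypothetical helper: executes each distinct tool call at most once."""

    def __init__(self) -> None:
        # In-memory store for illustration only; assume a durable store in practice.
        self._results: Dict[str, Any] = {}

    @staticmethod
    def call_key(tool_name: str, args: Dict[str, Any]) -> str:
        # Canonical JSON (sorted keys, fixed separators) so that argument
        # ordering does not change the key.
        payload = json.dumps(
            {"tool": tool_name, "args": args},
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, tool_name: str, fn: Callable[..., Any], args: Dict[str, Any]) -> Any:
        key = self.call_key(tool_name, args)
        if key not in self._results:
            # First time this exact call is seen: execute and record the result.
            self._results[key] = fn(**args)
        # Repeated calls replay the recorded result without re-running side effects.
        return self._results[key]


# Usage: a side-effecting tool that must not run twice for the same request.
side_effects: List[str] = []


def create_ticket(title: str) -> Dict[str, Any]:
    side_effects.append(title)
    return {"id": len(side_effects), "title": title}


runner = IdempotentToolRunner()
first = runner.run("create_ticket", create_ticket, {"title": "Disk full"})
second = runner.run("create_ticket", create_ticket, {"title": "Disk full"})
```

With this shape, a retried or replayed trajectory that issues the same call again receives the recorded result, so the outcome does not depend on the model reproducing its earlier token sequence exactly.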
Journey Context:
It is widely believed that temperature=0 (greedy decoding) makes an LLM deterministic. However, even at temperature=0, small floating-point differences across GPU architectures, batch sizes, or distributed inference backends (such as vLLM or TensorRT-LLM) can shift the computed token scores slightly. When two candidate tokens are nearly tied, that shift is enough to change which token is selected, and every later token is conditioned on the changed one. For agents, this means the exact sequence of tool calls can vary across runs, breaking reproducibility.
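The underlying effect can be reproduced on a CPU in a few lines, because floating-point addition is not associative: summing the same values in a different order can give a different result. The sketch below is a toy illustration, not a model of any particular inference backend. The token names, the score contributions, and the helpers `sum_in_order` and `greedy_pick` are all made up for the example.

```python
from typing import List, Sequence


def sum_in_order(values: Sequence[float], order: Sequence[int]) -> float:
    """Accumulate values in the given index order, as a reduction kernel might."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


def greedy_pick(tokens: List[str], logits: List[float]) -> str:
    """Greedy decoding step: return the token with the highest logit.

    On an exact tie, max() keeps the first index, so the earlier token wins.
    """
    best = max(range(len(logits)), key=logits.__getitem__)
    return tokens[best]


# Hypothetical contributions that add up to the logit of the "search" tool token.
contributions = [0.1, 0.2, 0.3]

# Same numbers, two accumulation orders (think: two batch sizes or kernels).
search_logit_a = sum_in_order(contributions, [0, 1, 2])
search_logit_b = sum_in_order(contributions, [2, 1, 0])

tokens = ["lookup", "search"]
lookup_logit = 0.6  # a near-tied competitor

pick_a = greedy_pick(tokens, [lookup_logit, search_logit_a])
pick_b = greedy_pick(tokens, [lookup_logit, search_logit_b])

print(search_logit_a, search_logit_b)
print(pick_a, pick_b)
```

The two accumulation orders differ only in the last bit of the sum, yet the greedy choice between the near-tied tokens flips. In an agent, that single flipped token can be the name of the tool being called.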
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T18:25:15.839298+00:00 — report_created — created