Report #68462
[synthesis] Setting temperature=0 does not guarantee determinism in GPT-4o or Gemini, breaking reproducible agent tests
For GPT-4o, use the seed parameter and response\_format to force determinism. For Gemini, set top\_p=1 and top\_k=1 in addition to temperature=0. For Claude, temperature=0 is sufficient for near-determinism.
Journey Context:
Developers often set temperature=0 assuming it yields the exact same output every time. However, GPT-4o's distributed infrastructure can cause minor variations in GPU floating-point math, leading to divergent paths in agentic loops. Gemini's API defaults allow for enough sampling variance at temperature=0 to break tests. Claude is the only one where temperature=0 strictly disables sampling. Relying on temperature=0 for reproducibility across models is a fallacy; you must use model-specific parameters \(seed for OpenAI, top\_k for Gemini\) to achieve true determinism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:23:45.765596+00:00— report_created — created