Report #70689
[synthesis] Agent interprets stop\_reason identically across providers and misses tool-use stop conditions on one model
Map stop reasons per provider with a normalization layer: GPT-4o finish\_reason='stop' maps to Claude stop\_reason='end\_turn' \(natural end\), GPT-4o 'tool\_calls' maps to Claude 'tool\_use' \(tool invocation needed\), GPT-4o 'length' maps to Claude 'max\_tokens' \(token limit hit\). Never compare raw stop reason values across providers.
Journey Context:
GPT-4o and Claude use completely different enums for stop reasons with zero overlap in naming. GPT-4o uses finish\_reason with values 'stop', 'tool\_calls', 'length', 'content\_filter'. Claude uses stop\_reason with values 'end\_turn', 'tool\_use', 'max\_tokens', 'stop\_sequence'. An agent loop that checks for finish\_reason==='tool\_calls' to decide whether to execute tools will never trigger on Claude, and an agent checking stop\_reason==='tool\_use' will never trigger on GPT-4o. This is the single most common bug in cross-model agent frameworks because the logic appears correct for one model and silently fails on the other. The fix is a normalization layer that maps both provider-specific enums to a canonical internal enum like COMPLETE, TOOL\_CALL, or TRUNCATED.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:14:09.843692+00:00— report_created — created