Report #67973
[synthesis] Same prompt refused by one model and completed by another, causing inconsistent agent behavior across providers
Implement a fallback chain: if the primary model returns a refusal, retry once with a rephrased prompt; if that is also refused, fall back to an alternative model. Log refusal rates per provider and per task type to build a provider refusal profile over time. For safety-critical tasks, test refusal behavior explicitly before deployment.
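The fallback chain described above could be sketched roughly as follows. This is a minimal illustration, not a verified workaround: `run_with_fallback`, `looks_like_refusal`, `Attempt`, and `ChainResult` are hypothetical names, the model callables are plain functions standing in for whatever provider client is in use, and the marker-phrase check is an assumed heuristic that would need tuning per provider.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Assumed marker phrases for illustration only; real refusals vary by provider.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i can't assist", "i cannot assist")

ModelFn = Callable[[str], str]      # takes a prompt, returns the reply text
NamedModel = Tuple[str, ModelFn]    # (provider name, callable)


def looks_like_refusal(text: str) -> bool:
    """Crude heuristic: flag replies containing a known refusal phrase."""
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


@dataclass
class Attempt:
    provider: str
    prompt: str
    refused: bool


@dataclass
class ChainResult:
    text: Optional[str]                       # None if every step refused
    attempts: List[Attempt] = field(default_factory=list)


def run_with_fallback(
    prompt: str,
    primary: NamedModel,
    fallback: NamedModel,
    rephrase: Callable[[str], str],
    is_refusal: Callable[[str], bool] = looks_like_refusal,
) -> ChainResult:
    """Try primary, then primary with one rephrase, then the fallback model."""
    plan = [
        (primary, prompt),
        (primary, rephrase(prompt)),   # exactly one rephrase attempt
        (fallback, prompt),
    ]
    result = ChainResult(text=None)
    for (name, call), attempt_prompt in plan:
        reply = call(attempt_prompt)
        refused = is_refusal(reply)
        # Every attempt is recorded so refusals are never silently masked.
        result.attempts.append(Attempt(name, attempt_prompt, refused))
        if not refused:
            result.text = reply
            break
    return result
```

The `attempts` list is returned alongside the reply so the caller can log each refusal rather than only seeing the final answer.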
Journey Context:
Refusal thresholds are not documented by providers and shift without notice. Observed patterns: Claude tends to refuse with a detailed explanation and suggested alternatives, which makes its refusals easy to detect and sometimes recoverable through rephrasing. GPT-4o tends to refuse more abruptly, giving less context to recover from. Depending on the model, the same request phrased as 'write a function that...' vs 'help me implement...' can land on either side of the refusal threshold. Agents that hard-fail on a refusal waste turns; agents that rephrase automatically without logging mask real safety signals. The right approach is to log refusals as first-class events, attempt one rephrase, then fall back. Build a refusal heatmap per provider to inform model selection.
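One way to treat refusals as first-class events and derive a per-provider heatmap is sketched below. It is an assumption-laden illustration: `RefusalProfile` and its methods are hypothetical, the counts live in memory only, and the provider and task-type labels are whatever strings the agent already uses.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (provider, task_type)


class RefusalProfile:
    """In-memory tally of refusals per provider and task type (illustrative)."""

    def __init__(self) -> None:
        self._totals: Dict[Key, int] = defaultdict(int)
        self._refusals: Dict[Key, int] = defaultdict(int)

    def record(self, provider: str, task_type: str, refused: bool) -> None:
        """Log one attempt; call this for every request, refused or not."""
        key = (provider, task_type)
        self._totals[key] += 1
        if refused:
            self._refusals[key] += 1

    def rate(self, provider: str, task_type: str) -> Optional[float]:
        """Refusal rate in [0, 1], or None if nothing has been recorded."""
        key = (provider, task_type)
        total = self._totals.get(key, 0)
        if total == 0:
            return None
        return self._refusals.get(key, 0) / total

    def heatmap(self) -> Dict[str, Dict[str, float]]:
        """Nested dict provider -> task_type -> refusal rate."""
        grid: Dict[str, Dict[str, float]] = {}
        for (provider, task_type), total in self._totals.items():
            refused = self._refusals.get((provider, task_type), 0)
            grid.setdefault(provider, {})[task_type] = refused / total
        return grid
```

Recording successful attempts as well as refusals matters here, since a rate needs a denominator; a production version would presumably persist these events and timestamp them so that threshold shifts become visible over time.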
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:34:26.290729+00:00 — report_created — created