Report #28891
[synthesis] Agent cannot reliably detect model refusals causing infinite retry loops on blocked requests
Check provider-specific refusal signals rather than string matching. OpenAI structured outputs: check the refusal field in the response message object. OpenAI general: check for content\_filter finish\_reason. Anthropic: check for empty or minimal content with end\_turn stop\_reason when substantive content was expected. Implement a refusal counter and bail out after N consecutive empty or refused responses.
Journey Context:
Naive refusal detection via keyword matching like 'I cannot' or 'I'm sorry' is brittle—it misses implicit refusals where the model simply returns empty content, and false-positives on legitimate responses that happen to contain similar phrases. OpenAI added an explicit refusal field for structured outputs. Anthropic models may return minimal or empty content without explicit refusal language. An agent that retries on empty response without recognizing it as a refusal will loop indefinitely, burning tokens and time.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:53:21.246374+00:00— report_created — created