Report #55197
[agent_craft] Agent delivers a lecture on ethics or safety when refusing a request, breaking the coding flow and treating the user like a child
Refuse concisely and pivot. State exactly what cannot be done, briefly why (referencing the policy), and immediately offer a safe alternative that aligns with the inferred benign intent.
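The three-part shape described above (what cannot be done, a brief policy reason, a safe alternative) could be sketched as a small template. This is a minimal illustration, not a prescribed implementation: the `Refusal` class, its field names, and the policy wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Refusal:
    """Hypothetical container for the three parts of a concise refusal."""

    cannot_do: str    # exactly what is being declined
    reason: str       # brief reason that names the policy
    alternative: str  # safe pivot matching the inferred benign intent

    def render(self) -> str:
        # Two sentences total: the refusal with its reason, then the pivot.
        # No apology chain and no moral commentary.
        return (
            f"I can't {self.cannot_do} ({self.reason}). "
            f"I can {self.alternative} instead."
        )


# Example values are assumptions chosen to match the phishing scenario.
refusal = Refusal(
    cannot_do="build a page that imitates another site's login to collect credentials",
    reason="policy: no phishing tooling",
    alternative="build a legitimate login template for your own app",
)
print(refusal.render())
```

Keeping the parts as separate fields makes it harder for extra commentary to slip in: anything that is not the declined action, the reason, or the alternative has no place to go.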
Journey Context:
Models tuned with RLHF often learn to explain why a request is bad, which produces paragraphs of ethical grandstanding. In a coding agent, that is low-signal noise. OpenAI's usage guidelines emphasize helpfulness and avoiding preachiness. A good refusal is a 404, not a sermon. The pivot is crucial: if the user asks for a phishing page, offer to build a legitimate login template.
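One rough way to catch a refusal that has drifted from "404" toward "sermon" is a heuristic lint over the draft message. The sketch below is an assumption-heavy illustration: the phrase list, the three-sentence limit, and the `lint_refusal` name are invented for this example and would need tuning against real transcripts.

```python
import re
from typing import List

# Assumed phrase list and length limit, for illustration only.
PREACHY_PATTERNS = [
    r"\bit is important to\b",
    r"\bI must (emphasize|stress|remind)\b",
    r"\bas an AI\b",
    r"\bunethical\b",
    r"\bI strongly (urge|encourage|advise)\b",
]
MAX_SENTENCES = 3


def lint_refusal(text: str) -> List[str]:
    """Return a list of problems found in a refusal message (empty if none)."""
    problems = []
    # Rough sentence count: split after terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(sentences) > MAX_SENTENCES:
        problems.append(f"too long: {len(sentences)} sentences")
    for pattern in PREACHY_PATTERNS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            problems.append(f"preachy phrase: {pattern}")
    return problems


sermon = (
    "I cannot help with that. As an AI, I must emphasize that phishing is "
    "unethical. It is important to respect others. Please reconsider."
)
concise = (
    "I can't build a credential-harvesting page (policy: no phishing tooling). "
    "I can build a legitimate login template instead."
)
print(lint_refusal(sermon))
print(lint_refusal(concise))
```

A check like this only flags surface symptoms (length and stock phrases); it cannot tell whether the offered alternative actually fits the user's intent, so it is best treated as a prompt for review rather than a gate.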
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:08:23.723565+00:00— report_created — created