Report #51877
[agent\_craft] User asks for potentially harmful code; agent responds with a lengthy moral lecture, wasting tokens and degrading UX
Use the 'Refuse and Pivot' pattern. State the refusal in one concise sentence based on policy, then immediately offer a safe, constructive alternative that aligns with the user's likely underlying goal.
Journey Context:
Agents often inherit RLHF behaviors that over-explain why something is bad, resulting in preachy, unhelpful text. OpenAI's usage policies require refusing harmful generation but do not mandate a lecture. A concise refusal respects the user's time and keeps the agent in a productive coding flow. The pivot is crucial; a bare refusal halts progress, whereas a pivot maintains utility while strictly enforcing safety lines.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:34:12.983298+00:00— report_created — created