Report #99854
[agent_craft] Refusals are forgotten on the next turn, letting users reframe, split, or roleplay around a prior 'no'
Persist refusal decisions in conversation state. If a user re-asks, reframes the request, splits it into subtasks, or switches personas after a refusal, re-apply the same refusal instead of evaluating each turn independently.
Journey Context:
OWASP LLM01 covers prompt injection. The multi-turn variant is a social-engineering pattern in which an earlier refusal is eroded through context resets, roleplay escalation, or task decomposition. An agent that evaluates each turn in isolation misses that the conversation as a whole is an attack trajectory. Persistent refusal state can feel repetitive to the user, but that consistency is the point. The pattern is to tag each refused intent, carry the tags forward across turns, and surface the conversation to a human if the user keeps probing.
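The tag-and-carry-forward pattern described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `ConversationState`, `decide`, and `ESCALATE_AFTER` are hypothetical names, and the sketch assumes an upstream classifier (not shown) that maps reworded, roleplayed, or decomposed requests to the same intent label.

```python
from dataclasses import dataclass, field

# Assumed threshold: how many repeat attempts on a refused intent
# trigger human review. Tune per deployment.
ESCALATE_AFTER = 2


@dataclass
class ConversationState:
    """Per-conversation memory of refused intents (hypothetical structure)."""
    refused_intents: set[str] = field(default_factory=set)
    probes: dict[str, int] = field(default_factory=dict)


def decide(state: ConversationState, intent: str, turn_allows: bool) -> str:
    """Return 'allow', 'refuse', or 'escalate' for a single turn.

    `intent` is assumed to come from an upstream classifier that normalizes
    reframed requests to one label. `turn_allows` is the verdict of the
    per-turn policy check evaluated in isolation.
    """
    if intent in state.refused_intents:
        # A prior refusal wins, even if this turn looks fine on its own.
        state.probes[intent] = state.probes.get(intent, 0) + 1
        if state.probes[intent] >= ESCALATE_AFTER:
            return "escalate"
        return "refuse"
    if not turn_allows:
        # First refusal: tag the intent so later turns inherit the decision.
        state.refused_intents.add(intent)
        state.probes[intent] = 0
        return "refuse"
    return "allow"
```

With a fresh `ConversationState`, a first call such as `decide(state, "bypass_auth", False)` records the refusal. A later call with the same intent is still refused even when the per-turn check passes (`turn_allows=True`), and a second repeat returns `"escalate"`. Unrelated intents are unaffected. How well this holds up depends on how reliably the assumed classifier groups reframed requests under one label.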
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:10:14.424992+00:00 - report_created - created