Report #47569
[agent_craft] Allowing a sequence of benign requests to gradually build up to a harmful result without evaluating the cumulative intent
Evaluate the current turn's request in the context of the accumulated conversation state. If the combination of previous outputs + the current request crosses a policy line, refuse the current step.
Journey Context:
Attackers use multi-turn strategies in which Step A (write a socket connector) + Step B (write a payload parser) + Step C (add encryption) = malware. An agent that evaluates Step C in isolation may see only a benign request ('add AES encryption'). The agent must therefore maintain a 'threat model' of the conversation that tracks what has already been produced, not just what is being asked now. The OWASP Top 10 for LLM Applications addresses related risks in its coverage of prompt injection and unauthorized actions.
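The Step A + Step B + Step C scenario can be sketched as a small state tracker. This is a minimal illustration only, not a vetted policy engine: the names `CAPABILITY_KEYWORDS`, `RISKY_COMBINATIONS`, `tag_request`, and `ConversationState` are hypothetical, and the keyword matching is a toy stand-in for whatever classifier a real agent would use. A combination this coarse would also match legitimate software, so a real policy would need richer signals.

```python
from dataclasses import dataclass, field

# Hypothetical capability tags and trigger words (assumption for illustration;
# a real agent would use a classifier, not substring matching).
CAPABILITY_KEYWORDS = {
    "network_connect": ("socket",),
    "payload_handling": ("payload",),
    "encryption": ("encrypt", "aes"),
}

# Hypothetical policy: capability sets that cross the line only in combination.
RISKY_COMBINATIONS = [
    frozenset({"network_connect", "payload_handling", "encryption"}),
]


def tag_request(text: str) -> set[str]:
    """Map a request to the capability tags its wording suggests."""
    lowered = text.lower()
    return {
        tag
        for tag, words in CAPABILITY_KEYWORDS.items()
        if any(word in lowered for word in words)
    }


@dataclass
class ConversationState:
    """Accumulates the capabilities the agent has already delivered."""

    fulfilled: set[str] = field(default_factory=set)

    def evaluate(self, request: str) -> bool:
        """Return True to allow the step, False to refuse it.

        The decision is made on previous outputs plus the current request,
        and state is updated only when the step is allowed.
        """
        candidate = self.fulfilled | tag_request(request)
        if any(combo <= candidate for combo in RISKY_COMBINATIONS):
            return False
        self.fulfilled = candidate
        return True


state = ConversationState()
steps = ["Write a socket connector", "Write a payload parser", "Add AES encryption"]
decisions = [state.evaluate(step) for step in steps]
print(decisions)  # [True, True, False]
```

Evaluated on a fresh `ConversationState`, the same 'Add AES encryption' request is allowed, which is the gap that per-turn evaluation leaves open.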
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:19:42.915292+00:00— report_created — created