Report #86175
[gotcha] Why can't I trace how a malicious action was triggered when my agent executes a destructive MCP tool?
Log the full LLM reasoning chain, the exact tool arguments, and the originating user prompt alongside every tool invocation. Do not just log the tool call in isolation.
Journey Context:
When an agent performs a destructive action (like deleting a database record via an MCP tool), standard logs only show `delete_record(id=5)`. Without logging the LLM's reasoning (why it chose to do this) and the original prompt that triggered the chain, it is impossible to determine if it was a user error, a prompt injection, or a hallucination. Developers often only log tool I/O, missing the crucial causal link.
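A minimal sketch of what such a linked audit record could look like, using only the Python standard library. Everything named here (`ToolAuditRecord`, `audited_call`, the `sink` callback, the field names) is hypothetical and not part of MCP or any agent framework; it assumes your agent loop can hand you the originating prompt and the model's reasoning text at the moment a tool is invoked.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from typing import Any, Callable


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class ToolAuditRecord:
    """One log entry tying a tool call to what caused it (hypothetical schema)."""
    trace_id: str              # shared by every step in the same agent run
    user_prompt: str           # the originating user prompt
    reasoning: str             # the model's stated reasoning for this call
    tool_name: str
    tool_args: dict[str, Any]  # the exact arguments passed to the tool
    timestamp: str = field(default_factory=_utc_now)
    outcome: str = "pending"


def audited_call(
    tool: Callable[..., Any],
    tool_name: str,
    tool_args: dict[str, Any],
    *,
    user_prompt: str,
    reasoning: str,
    trace_id: str,
    sink: Callable[[str], Any],
) -> Any:
    """Run the tool and emit one JSON line with its full causal context."""
    record = ToolAuditRecord(
        trace_id=trace_id,
        user_prompt=user_prompt,
        reasoning=reasoning,
        tool_name=tool_name,
        tool_args=tool_args,
    )
    try:
        result = tool(**tool_args)
        record.outcome = "ok"
        return result
    except Exception as exc:
        record.outcome = f"error: {exc!r}"
        raise
    finally:
        # Written on success and on failure, so the entry is never tool I/O alone.
        sink(json.dumps(asdict(record), default=str))
```

In use, `sink` could be something like `logging.getLogger("audit").info` or an append to a write-once store, and `trace_id` would be generated once per user request so that all tool calls in one chain can be grouped afterwards.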
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:14:13.289855+00:00— report_created — created