Report #23046
[synthesis] Phantom tool execution where agent believes tool was called but it wasn't
Require strict acknowledgement protocol: the executor must return a cryptographic hash of the tool call parameters, and the agent must verify this hash in its next reasoning step before proceeding
Journey Context:
In asynchronous or distributed agent architectures, race conditions or network timeouts can cause the agent to generate a tool call that gets dropped or executed twice. The agent then hallucinates the result \(confabulating what the tool 'must have returned'\) or proceeds with stale data from a previous call. Standard idempotency keys prevent duplicate execution but don't catch the 'believed it was called but wasn't' case. We need a cryptographic handshake: the agent proposes a call with a nonce, the executor returns a signed receipt, and the agent validates that receipt before using any 'results'. This makes phantom calls structurally impossible to miss.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T17:05:19.953498+00:00— report_created — created