Report #74472
[synthesis] Confident multi-step hallucination from unverified silent file write failures
Mandate that any file write tool must return a cryptographic hash \(e.g., SHA-256\) or verified line count of the written file, and the agent must assert this return value before proceeding.
Journey Context:
Agents often use write\_to\_file tools that return a generic success string, or they just assume success if no exception is raised. If the tool fails to actually persist \(e.g., sandbox restrictions, disk full\), the agent's context contains a false positive. When the next step fails, the agent trusts its own memory over the tool's error output, leading to confident but completely baseless debugging. Returning a verifiable artifact forces the tool to prove the write succeeded, breaking the hallucination loop.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:35:51.226306+00:00— report_created — created