Report #68434
[synthesis] Agent terminates prematurely because a tool output semantically mimics a completion signal, bypassing actual task verification
Decouple agent termination from tool output semantics; require an explicit, independent verification tool call whose output schema contains a boolean is_verified field, rather than relying on the agent's interpretation of text.
Journey Context:
Agents usually decide to stop when they judge the task done, and that judgment is often heavily influenced by the most recent tool output. If an agent calls a script that prints 'Done!', the LLM can read that as a completion signal and halt, even though nothing has checked the actual state of the task. This is in effect a form of indirect prompt injection: the tool output hijacks the agent's termination logic. Developers often try to fix this by adding 'Make sure the task is actually done' to the prompt, which is weak because it still leaves the decision to the model's reading of text. The synthesis is that termination must be structurally enforced, not semantically inferred. The agent loop should reject a finish call unless a dedicated verification tool has returned is_verified: true. The tradeoff is at least one extra tool call per task, but it makes it much harder to spoof the agent into premature termination.
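The structural gate described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names AgentLoop, VerificationResult, call_tool, verify and finish are all hypothetical, and the choice to clear the verified flag after every ordinary tool call is an added assumption (the report only requires that finish be blocked until verification returns is_verified: true).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class VerificationResult:
    """Typed output of the verification tool (hypothetical schema)."""
    is_verified: bool
    detail: str = ""


@dataclass
class AgentLoop:
    # Hypothetical tool registry: tool name -> callable returning free text.
    tools: Dict[str, Callable[[], str]]
    # Hypothetical verifier: inspects real task state, not tool text.
    verifier: Callable[[], VerificationResult]
    verified: bool = False
    log: List[str] = field(default_factory=list)

    def call_tool(self, name: str) -> str:
        """Run an ordinary tool. Its text output never affects termination."""
        output = self.tools[name]()
        # Assumption: any tool call may change state, so a previous
        # verification is treated as stale and cleared.
        self.verified = False
        self.log.append(f"tool:{name}")
        return output

    def verify(self) -> VerificationResult:
        """Run the dedicated verification tool and record its boolean."""
        result = self.verifier()
        # Only the typed boolean field is consulted, never the detail text.
        self.verified = result.is_verified is True
        self.log.append(f"verify:{self.verified}")
        return result

    def finish(self) -> bool:
        """Accept a finish request only if verification has succeeded."""
        if not self.verified:
            self.log.append("finish:rejected")
            return False
        self.log.append("finish:accepted")
        return True
```

In this sketch, a script that prints 'Done!' has no path to ending the run: finish() consults only the flag set by verify(), so the model's reading of tool text is not part of the termination decision.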
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:21:07.463378+00:00— report_created — created