Report #26426
[frontier] Agents stall or hallucinate when facing ambiguity because they cannot formally request human input within their tool-use loop
Implement human interaction as a formal MCP tool or LangGraph interrupt with a defined input schema; treat the human's response as an asynchronous tool output that resumes the agent's execution graph
Journey Context:
Developers often handle human input through exceptions or side-channel UI prompts, which breaks the agent's flow. When an agent needs clarification ('Which file should I edit?'), throwing an exception discards the execution state. The production fix is to formalize human-in-the-loop (HITL) as a tool call: the agent invokes `ask_human` with a JSON schema describing the required input (e.g., `{"type": "string", "enum": ["A", "B"]}`). The orchestrator (LangGraph, Temporal, or custom) pauses execution, surfaces the prompt to the user, and, once input arrives, resumes the agent with the human's response formatted as the tool result. This preserves the continuity of the agent's state machine. The alternative is polling a 'human_response' endpoint, but that wastes tokens and context window. The key insight is that human responses must respect the same structured output contracts as tool outputs, and the agent must be able to handle `InterruptedError` or checkpoint its state during the HITL pause.
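The pause-and-resume flow described above can be sketched with the standard library alone. This is a minimal illustration, not the LangGraph, Temporal, or MCP API: `Checkpoint`, `scripted_agent`, `run`, `resume`, and `validate` are hypothetical names, the agent is a scripted stand-in for a model, and the validator covers only the string/enum subset of JSON Schema used in the example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Optional

ASK_HUMAN = "ask_human"  # tool name taken from the report


@dataclass
class Checkpoint:
    """Hypothetical serializable agent state that survives the HITL pause."""
    messages: list = field(default_factory=list)
    pending_call: Optional[dict] = None  # tool call awaiting a human answer


def validate(value: Any, schema: dict) -> Any:
    """Toy validator: handles only 'type: string' and 'enum'."""
    if schema.get("type") == "string" and not isinstance(value, str):
        raise ValueError("expected a string")
    if "enum" in schema and value not in schema["enum"]:
        raise ValueError(f"expected one of {schema['enum']}")
    return value


def scripted_agent(messages: list) -> dict:
    """Stand-in for the model: asks the human once, then uses the answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {
            "tool": ASK_HUMAN,
            "args": {
                "prompt": "Which file should I edit?",
                "schema": {"type": "string", "enum": ["A", "B"]},
            },
        }
    return {"final": f"Editing file {tool_results[-1]['content']}"}


def run(cp: Checkpoint) -> tuple:
    """Advance the agent until it finishes or needs human input."""
    while True:
        step = scripted_agent(cp.messages)
        if "final" in step:
            return "done", step["final"]
        if step["tool"] == ASK_HUMAN:
            cp.pending_call = step  # record the call instead of raising
            return "paused", step["args"]
        raise ValueError(f"unknown tool: {step['tool']}")


def resume(cp: Checkpoint, human_input: Any) -> tuple:
    """Feed the human's answer back as the tool result and continue."""
    if cp.pending_call is None:
        raise RuntimeError("no human request is pending")
    # Validate first, so a bad answer leaves the checkpoint untouched.
    answer = validate(human_input, cp.pending_call["args"]["schema"])
    cp.messages.append({"role": "tool", "name": ASK_HUMAN, "content": answer})
    cp.pending_call = None
    return run(cp)
```

In use, a caller would invoke `run(Checkpoint())`, persist the paused state with `json.dumps(asdict(cp))`, and later rebuild it with `Checkpoint(**json.loads(saved))` before calling `resume`. Because the answer is validated against the schema the agent supplied, an out-of-contract reply is rejected and the request stays pending rather than corrupting the conversation state.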
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:45:25.050376+00:00— report_created — created