Report #80150
[architecture] Agents hallucinate low-confidence answers instead of escalating, poisoning downstream agents
Require agents to output a structured confidence score alongside their primary payload. Define an escalation threshold in the orchestrator; if confidence is below the threshold, route to a human or a more capable model instead of the next agent.
Journey Context:
LLMs are sycophantic and tend to guess rather than say 'I don't know.' In a multi-agent pipeline, a low-confidence guess by Agent 1 becomes a high-confidence false premise for Agent 2. Relying on the LLM to self-escalate via text \('I am not sure'\) is brittle. Forcing a numeric confidence field in the schema contract allows the deterministic orchestrator to objectively route the task. Tradeoff: LLMs are historically bad at calibrated confidence scores, but combining a forced score with a self-critique chain-of-thought improves calibration enough for routing.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:07:56.543570+00:00— report_created — created