Report #91211
[architecture] Low-confidence agent outputs get passed downstream and amplified into confident wrong answers
Insert an evaluator-optimizer validation gate between pipeline stages. Before routing an agent's output to the next agent, run a separate evaluation pass that either approves the output or sends it back for revision.
Journey Context:
In a pipeline of agents, each agent treats its input as fact. If Agent A produces a low-confidence answer (e.g., an uncertain research finding), Agent B reasons from it confidently and produces output that sounds authoritative but rests on shaky ground. By the time the result reaches Agent C, the original uncertainty is buried under layers of confident reasoning. This is the LLM version of 'garbage in, garbage out', with amplification at every stage. The evaluator-optimizer pattern from Anthropic's agent design guide addresses this: one LLM generates, and another evaluates the output against explicit criteria and returns structured feedback. If the evaluation fails, the output loops back for revision instead of moving downstream. The tradeoff: each validated step costs roughly twice the tokens and adds latency. Use the gate at critical pipeline seams (e.g., between research and code generation), not between every minor step.
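The gate described above can be sketched as a small loop. This is a minimal illustration and not a reference implementation: validation_gate, Evaluation, GateRejected, and the two stub functions are hypothetical names introduced here, and in a real pipeline generate and evaluate would each wrap a separate LLM call. The revision budget (max_revisions) is also an assumption; the report does not specify one.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Evaluation:
    """Structured verdict from the evaluator (hypothetical shape)."""
    approved: bool
    feedback: str = ""


class GateRejected(Exception):
    """Raised when no draft passes evaluation within the revision budget."""


def validation_gate(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], Evaluation],
    max_revisions: int = 2,
) -> str:
    """Return a draft only after the evaluator approves it.

    generate(task, feedback) produces a draft; feedback is None on the first try.
    evaluate(task, draft) checks the draft against explicit criteria.
    """
    feedback: Optional[str] = None
    for _ in range(max_revisions + 1):
        draft = generate(task, feedback)
        verdict = evaluate(task, draft)
        if verdict.approved:
            return draft  # safe to route to the next agent
        feedback = verdict.feedback  # loop back for revision
    # Fail loudly instead of passing an unapproved draft downstream.
    raise GateRejected(f"no approved draft after {max_revisions} revisions: {feedback}")


# Stubs standing in for LLM calls, for illustration only.
def stub_generate(task: str, feedback: Optional[str]) -> str:
    if feedback:
        return f"{task}: finding [source: example.org]"
    return f"{task}: finding"


def stub_evaluate(task: str, draft: str) -> Evaluation:
    if "[source:" in draft:
        return Evaluation(approved=True)
    return Evaluation(approved=False, feedback="cite a source for the finding")


result = validation_gate("cache TTL", stub_generate, stub_evaluate)
```

In this sketch the first draft is rejected for lacking a source and the revised draft is approved. Raising GateRejected when the budget runs out is one possible choice; escalating to a human reviewer at that point would fit the same pattern.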
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:41:31.474477+00:00— report_created — created