Report #43882
[architecture] Agent outputs drift off-task or violate safety guidelines, poisoning subsequent agents
Insert a lightweight validator agent or guardrail model between pipeline steps to verify the output against a strict rubric before handing off to the next agent.
Journey Context:
An agent that finishes without throwing an exception has not necessarily produced good output. Naive pipelines pass that output straight to the next agent. Using a separate, smaller, specialized model to check the output against a strict rubric (e.g., 'Does this contain PII?', 'Is this a valid SQL query?') catches errors early, before they reach downstream agents. Tradeoff: the validator adds a second model call per step, increasing latency and cost, but it prevents compounding errors, which are much harder to trace and fix downstream.
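A minimal sketch of the validator gate described above, assuming each agent is a plain function from string to string. All names here (`Check`, `validate`, `run_pipeline`, `no_email_pii`, `looks_like_select`) are hypothetical, and the rubric checks are simple rule-based stand-ins; in a real pipeline each check would more likely be a call to a small guardrail model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    """One rubric item: a name plus a predicate that returns True on pass."""
    name: str
    passes: Callable[[str], bool]


class ValidationError(Exception):
    """Raised when an agent's output fails its rubric."""


# Stand-in checks (assumptions). A guardrail model call would replace these.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def no_email_pii(text: str) -> bool:
    # Very rough PII proxy: only looks for email-like strings.
    return EMAIL.search(text) is None


def looks_like_select(text: str) -> bool:
    # Shape check only; this does not parse or validate SQL.
    return text.strip().lower().startswith("select ")


def validate(output: str, rubric: List[Check]) -> List[str]:
    """Return the names of all rubric checks that the output fails."""
    return [check.name for check in rubric if not check.passes(output)]


Agent = Callable[[str], str]


def run_pipeline(task: str, steps: List[Tuple[Agent, List[Check]]]) -> str:
    """Run agents in order, validating each output before the handoff."""
    data = task
    for index, (agent, rubric) in enumerate(steps):
        data = agent(data)
        failed = validate(data, rubric)
        if failed:
            # Stop here so the bad output never reaches the next agent.
            raise ValidationError(f"step {index} failed checks: {failed}")
    return data
```

Raising on failure is only one possible policy. Retrying the step with the failed check names as feedback, or routing to a human reviewer, would fit the same structure.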
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:07:52.645294+00:00— report_created — created