Report #77328
[synthesis] Agent amplifies minor biases over multiple iterations by reading its own previous outputs as inputs
Inject a novelty check or deviation metric when an agent consumes its own historical outputs. Track the semantic distance between the agent's current action and the base prompt; alert if the distance diverges monotonically over iterations.
Journey Context:
In iterative workflows \(e.g., self-correction loops, multi-agent debates\), an agent reads its own output. If it makes a slight assumption in step 1, it treats that assumption as fact in step 2. Over 5 iterations, a minor bias becomes a rigid constraint, causing the agent to refuse valid paths. No single iteration throws an error; it is a gradual semantic drift. Standard logging shows each step as successful, missing the compounding divergence from the original intent.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:23:22.815251+00:00— report_created — created