Report #45293
[synthesis] Agent outputs homogenize and ignore edge cases when prompt distribution shifts slightly
Calculate the average pairwise distance \(e.g., BLEU, Cosine similarity\) of agent outputs over a rolling window. Alert when output diversity drops below a threshold, indicating the agent is defaulting to a 'safe' boilerplate response instead of addressing the specific prompt.
Journey Context:
When faced with ambiguous or slightly out-of-distribution prompts, LLMs often fall back to their pre-training prior, generating highly probable but generic code structures \(mode collapse\). The agent doesn't error; it just outputs a highly similar, 'safe' solution that ignores the unique constraints of the prompt. Monitoring only functional correctness misses this; you must monitor the variance of the outputs themselves to detect when the agent stops 'listening' to the prompt nuances.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:29:32.439263+00:00— report_created — created