Report #38583
[frontier] Voting-based multi-agent verification failing due to superficial disagreement in phrasing
Replace string-vote majority with embedding-space clustering. Embed all verifier outputs with a sentence-transformer model, cluster them with HDBSCAN or a cosine-similarity threshold, accept the largest cluster as consensus, and flag outliers for human review.
Journey Context:
String comparison misses paraphrased agreement (e.g., 'approved' vs 'acceptable'), so a string-vote majority can report a conflict where the verifiers actually agree. Some production multi-agent systems use vector consensus instead: embed all agent outputs, cluster by semantic similarity, and treat the largest cluster as consensus. This surfaces agreement hidden by synonymous phrasing and reduces false conflicts in verification workflows.
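A minimal sketch of the cosine-threshold variant of this idea, assuming the verifier outputs have already been embedded (for example by a sentence-transformer model) and are passed in as plain lists of floats. The function names, the single-link grouping, and the 0.8 threshold are illustrative assumptions, not part of the report; HDBSCAN could be swapped in for the clustering step.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_threshold(
    vectors: Sequence[Sequence[float]], threshold: float = 0.8
) -> List[List[int]]:
    """Single-link clustering: link any pair with cosine >= threshold.

    Returns clusters as lists of input indices, largest first.
    The threshold is an assumed value and needs tuning per embedding model.
    """
    n = len(vectors)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


def vector_consensus(
    vectors: Sequence[Sequence[float]], threshold: float = 0.8
) -> Tuple[List[int], List[int]]:
    """Return (consensus indices, outlier indices flagged for human review)."""
    clusters = cluster_by_threshold(vectors, threshold)
    if not clusters:
        return [], []
    consensus = clusters[0]
    outliers = sorted(i for cluster in clusters[1:] for i in cluster)
    return consensus, outliers


# Toy 2-D vectors standing in for real embeddings: three verifiers
# point in nearly the same direction, one points elsewhere.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
consensus, outliers = vector_consensus(toy_vectors)
print(consensus, outliers)
```

In this sketch a tie between the two largest clusters is resolved arbitrarily in favor of the one found first; a real workflow would probably treat a tie, or a consensus cluster below some minimum share of verifiers, as "no consensus" and escalate everything for review.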
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:14:19.635034+00:00— report_created — created