Report #75679
[synthesis] A/B test treatment contamination via AI-generated artifacts crossing group boundaries
Design network-aware experiments that account for AI-generated content flow. Measure contamination directly by sampling control-group exposure to treatment-group AI outputs. Use cluster-based randomization (by organization or social graph component) rather than user-level randomization for AI features that produce shared artifacts.
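A minimal sketch of cluster-based randomization, assuming each user maps to exactly one cluster (for example an organization ID). The names `assign_arm`, `assign_users`, and `experiment_salt` are hypothetical and not taken from any particular experimentation platform. Hashing the cluster ID rather than the user ID keeps everyone in a cluster in the same arm.

```python
import hashlib
from typing import Dict


def assign_arm(cluster_id: str, experiment_salt: str, treatment_share: float = 0.5) -> str:
    """Deterministically assign a whole cluster to one arm.

    Hypothetical helper: the salt keeps assignments independent across experiments.
    """
    key = f"{experiment_salt}:{cluster_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


def assign_users(user_to_cluster: Dict[str, str], experiment_salt: str) -> Dict[str, str]:
    """Give every user the arm of their cluster, never an individual arm."""
    return {
        user: assign_arm(cluster, experiment_salt)
        for user, cluster in user_to_cluster.items()
    }


if __name__ == "__main__":
    users = {"alice": "org-a", "bob": "org-a", "carol": "org-b"}
    print(assign_users(users, "ai-draft-exp"))
```

Cluster-level assignment reduces the number of independent units, so the analysis should treat the cluster, not the user, as the unit of randomization when computing variance.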
Journey Context:
Traditional A/B testing assumes that one user's assignment does not affect another user's outcome (no interference between treatment and control). This typically holds for UI changes but breaks down for AI features that generate content (emails, code, messages, summaries) that flows to users in the control group. A control-group user who reads an AI-drafted email from a treatment-group colleague is effectively treated. The contamination is invisible because standard A/B analytics track only direct feature interaction, not downstream artifact exposure. Kohavi's 'Trustworthy Online Controlled Experiments' documents network effects in general, but the AI-specific vector is harder to detect: the treatment itself produces persuasive, human-mimetic artifacts that propagate through existing social channels, and those artifacts are designed to be indistinguishable from human output. The result: the measured effect is biased toward zero because the control group is partly treated, so a feature that actually shifts behavior can show no significant treatment effect, and you ship it without being able to measure its impact.
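The contamination described above can be estimated from artifact-exposure logs. The sketch below assumes a hypothetical log of `(author_id, reader_id)` pairs for AI-generated artifacts; the function names and log shape are illustrative only. The attenuation helper uses a deliberately crude assumption, namely that a contaminated control user receives the full treatment effect, to show how the observed difference shrinks.

```python
from typing import Dict, Iterable, Tuple


def control_contamination_rate(
    arms: Dict[str, str],
    exposures: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of control users who read at least one artifact
    authored by a treatment-group user.

    `exposures` is a hypothetical log of (author_id, reader_id) pairs
    for AI-generated artifacts.
    """
    control_users = {user for user, arm in arms.items() if arm == "control"}
    if not control_users:
        raise ValueError("no control users in assignment")
    contaminated = {
        reader
        for author, reader in exposures
        if arms.get(author) == "treatment" and reader in control_users
    }
    return len(contaminated) / len(control_users)


def attenuated_effect(true_effect: float, contamination_rate: float) -> float:
    """Observed treatment-minus-control difference under the simplifying
    assumption that contaminated control users get the full effect."""
    return true_effect * (1.0 - contamination_rate)


if __name__ == "__main__":
    arms = {"a": "treatment", "b": "control", "c": "control"}
    log = [("a", "b"), ("a", "b"), ("b", "c")]
    rate = control_contamination_rate(arms, log)
    print(rate, attenuated_effect(0.10, rate))
```

In practice partial exposure probably transmits only part of the effect, so this is a rough bound on attenuation, not an estimator of the true effect.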
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:37:35.242136+00:00— report_created — created