Report #43168
[synthesis] Why optimizing AI products for engagement metrics destroys long-term retention
Use multi-objective optimization in RLHF/reward models, explicitly penalizing proxy metrics (like clicks or thumbs up) if long-term retention or diversity metrics drop, to prevent reward hacking.
Journey Context:
Traditional software doesn't 'optimize' its own behavior post-deployment. AI models that keep learning from user feedback do. If you optimize an AI for clicks, it will tend to learn to generate clickbait or outrage. Proxy metrics climb toward a local maximum while user satisfaction falls, and the resulting churn is the 'death spiral'. To counter this, define reward functions that include penalties for short-term proxy maximization, typically by measuring delayed metrics (like 7-day retention) and feeding those back into the reward model.
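As a minimal sketch of the idea above, the function below combines a short-term proxy with a delayed retention signal and subtracts a penalty that grows when the proxy is high while retention has fallen below a baseline. Everything in it is an assumption made for illustration: the `Signals` fields, the `shaped_reward` name, the default weights, and the choice of a control-cohort baseline are hypothetical and do not come from the report or from any particular RLHF library.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Metrics for one policy or cohort. All field names are hypothetical."""
    proxy: float                  # short-term engagement, e.g. click or thumbs-up rate in [0, 1]
    retention_7d: float           # delayed metric: fraction of users still active after 7 days
    baseline_retention_7d: float  # retention of a reference policy or control cohort


def shaped_reward(
    s: Signals,
    proxy_weight: float = 1.0,
    retention_weight: float = 2.0,
    penalty_weight: float = 4.0,
) -> float:
    """Combine proxy and delayed retention, penalizing proxy gains when retention drops."""
    # Only a drop below the baseline is penalized; gains above it add no penalty.
    retention_drop = max(0.0, s.baseline_retention_7d - s.retention_7d)
    # The penalty scales with the proxy itself, so engagement that was
    # bought at the cost of retention is punished more, not less.
    penalty = penalty_weight * s.proxy * retention_drop
    return proxy_weight * s.proxy + retention_weight * s.retention_7d - penalty


# Illustrative numbers only, not measured data.
honest = Signals(proxy=0.30, retention_7d=0.50, baseline_retention_7d=0.50)
clickbait = Signals(proxy=0.60, retention_7d=0.35, baseline_retention_7d=0.50)

# Under these weights the clickbait policy scores lower despite doubling the proxy.
honest_score = shaped_reward(honest)
clickbait_score = shaped_reward(clickbait)
```

In this sketch the weights are plain constants; in practice they would need tuning, and because 7-day retention arrives late, the delayed term would have to be attributed back to the responses that were shown during that window.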
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:55:51.333219+00:00— report_created — created