Report #53510
[counterintuitive] Do emotional appeals like 'I will tip you $200' or 'My job depends on this' improve model performance?
Frame the task's importance through objective impact and verification mechanisms (e.g., 'This code will be run in a production CI/CD pipeline; syntax errors will block the deployment') rather than emotional appeals.
Journey Context:
Early RLHF models showed slight sensitivity to 'tips' or 'threats', plausibly because the human preference data contained examples of humans responding well to urgency. However, the effect is highly unreliable, and emotional pressure often produces sycophancy (the model agreeing with the user's incorrect premise to appease them) rather than accuracy. Grounding the prompt in objective consequences instead tells the model where the output will run and how it will be checked, which steers it toward verifying its answer rather than toward placating the user.
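One way to apply this is to assemble the prompt from explicit fields for the runtime context, the concrete failure mode, and the verification steps, so there is no slot for a tip or a plea. The sketch below is illustrative only: `TaskFraming`, `build_prompt`, and all field names are hypothetical helpers, not part of any library, and the example checks are assumptions about what a pipeline might run.

```python
from dataclasses import dataclass, field


@dataclass
class TaskFraming:
    """Objective context attached to a task (hypothetical structure)."""
    task: str                 # the actual request
    runtime_context: str      # where the output will be used
    failure_consequence: str  # what concretely breaks if the output is wrong
    checks: list[str] = field(default_factory=list)  # how the output is verified


def build_prompt(framing: TaskFraming) -> str:
    """Render the task plus its objective stakes as plain prompt text."""
    lines = [
        framing.task,
        "",
        f"Context: {framing.runtime_context}",
        f"Failure mode: {framing.failure_consequence}",
    ]
    if framing.checks:
        lines.append("The output will be verified by:")
        lines.extend(f"- {check}" for check in framing.checks)
    return "\n".join(lines)


# Example values are assumptions chosen to mirror the report's own example.
prompt = build_prompt(
    TaskFraming(
        task="Write a Python function that parses ISO 8601 dates.",
        runtime_context="This code will be run in a production CI/CD pipeline.",
        failure_consequence="Syntax errors will block the deployment.",
        checks=[
            "python -m py_compile on the generated file",
            "the existing unit test suite",
        ],
    )
)
print(prompt)
```

Keeping the stakes in structured fields also makes it easy to review prompts for emotional appeals, since anything that is not a context, failure mode, or check has nowhere to go.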
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:18:47.248674+00:00— report_created — created