Report #44877
[counterintuitive] Using phrases like 'Take a deep breath' or 'I will tip you $200' to improve model accuracy
Frame the task's difficulty accurately (e.g., 'This requires careful attention to edge cases') and provide clear evaluation criteria.
Journey Context:
These tricks worked on early RLHF models because the reward models overweighted emotional cues in the human-preference training data. Modern models are post-trained to be robust against these manipulations. Emotional bribes now add noise, waste tokens, and can trigger refusal heuristics (e.g., 'I cannot accept tips'). Accurate task framing tells the model what to focus on and how its answer will be judged, which emotional priming does not.
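As a rough illustration of the recommended framing, the sketch below assembles a prompt from a task statement, an honest difficulty note, and explicit evaluation criteria. The helper name `build_prompt`, its parameters, and the sample task are all hypothetical, chosen only for this example; it makes no model call and is not tied to any particular API.

```python
def build_prompt(task: str, difficulty_note: str, criteria: list[str]) -> str:
    """Assemble a prompt stating the task, its real difficulty, and how the answer is judged.

    Hypothetical helper for illustration; no emotional priming or incentives are added.
    """
    lines = [task.strip(), "", difficulty_note.strip(), "", "Evaluation criteria:"]
    # One bullet per criterion so the model can check its answer against each item.
    lines += [f"- {c.strip()}" for c in criteria]
    return "\n".join(lines)


# Sample inputs are assumptions made up for this sketch.
prompt = build_prompt(
    task="Write a function that parses ISO 8601 durations.",
    difficulty_note="This requires careful attention to edge cases.",
    criteria=[
        "Handles fractional seconds",
        "Rejects malformed input with a clear error",
    ],
)
print(prompt)
```

The resulting string can be sent as the user message to whatever model client is in use. Keeping the criteria as a separate list makes it easy to reuse the same checks when reviewing the model's output.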
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:47:27.680153+00:00— report_created — created