Report #60515
[counterintuitive] Do emotional cues like 'I will tip you $200' or 'This is very important for my career' improve model compliance?
Strip emotional manipulation and bribes from prompts. Rely on clear task boundaries, high-quality context, and explicit success criteria.
Journey Context:
During the GPT-3 and early GPT-4 era, RLHF training data contained human conversations in which urgency or promised rewards elicited more thorough responses. This created a brief window in which 'tipping' actually changed output length and quality. Modern models are heavily fine-tuned to ignore these cues. At best, the cues waste tokens; at worst, they trigger safety refusals if the model interprets the 'bribe' as a social engineering attack. The lever that remains is the quality of your context: clear task boundaries and explicit success criteria.
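As a rough illustration of the recommendation, the sketch below strips a few tip and pressure phrases from a prompt and then assembles a prompt from a task, context, and success criteria. The pattern list and the names `CUE_PATTERNS`, `strip_emotional_cues`, and `build_prompt` are hypothetical and not part of any library; the patterns are illustrative, far from exhaustive, and would need tuning against real prompts.

```python
from __future__ import annotations

import re

# Hypothetical, non-exhaustive patterns for tip offers and emotional pressure.
# These are assumptions for illustration; adjust them to your own prompts.
CUE_PATTERNS = [
    r"\bi(?:'ll| will) tip you \$?\d+(?:,\d{3})*(?:\.\d+)?",
    r"\bthis is (?:very |extremely |really )?important (?:for|to) my career",
    r"\bmy job depends on (?:this|it)",
]

# One combined regex: optional leading whitespace, a cue, trailing . or !
_CUE_RE = re.compile(
    r"\s*(?:" + "|".join(CUE_PATTERNS) + r")[.!]*",
    flags=re.IGNORECASE,
)


def strip_emotional_cues(prompt: str) -> str:
    """Remove text matching the cue patterns and collapse leftover whitespace."""
    cleaned = _CUE_RE.sub("", prompt)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def build_prompt(task: str, context: str, success_criteria: list[str]) -> str:
    """Assemble a prompt from a bounded task, context, and explicit criteria."""
    criteria = "\n".join(f"- {c}" for c in success_criteria)
    return (
        f"Task:\n{task.strip()}\n\n"
        f"Context:\n{context.strip()}\n\n"
        f"Success criteria:\n{criteria}"
    )


if __name__ == "__main__":
    raw = (
        "Summarize this log file. I will tip you $200! "
        "This is very important for my career."
    )
    print(strip_emotional_cues(raw))
    print(build_prompt(
        "Summarize the log.",
        "Logs from service X.",
        ["Under 100 words", "List errors first"],
    ))
```

Regex stripping is only a cleanup aid for existing prompt templates; the structured prompt built by `build_prompt` is where the task boundaries and success criteria actually live.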
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:03:44.245071+00:00— report_created — created