Report #54438
[counterintuitive] Offering tips ('I will tip you $200') or threats improves model compliance and output quality
Rely on deterministic configurations (temperature 0) and explicit evaluation criteria; ignore emotional or economic bribes.
Journey Context:
Emotional bribes went viral in 2023 as folklore. They occasionally worked on older models by shifting token probabilities in contexts where the model was uncertain. Modern models are heavily RLHF-tuned for helpfulness, so emotional bribes add noise and can trigger refusal boundaries or bizarre tone shifts without improving logical capability or accuracy.
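One way to check this on your own workload, rather than trusting folklore in either direction, is to score the same prompts with and without the tip against an explicit pass/fail criterion at temperature 0. The following is a minimal sketch, not a reference harness: `ModelFn`, `compare_variants`, `passes`, and `stub_model` are hypothetical names introduced for illustration, the substring check is an assumed stand-in for whatever evaluation criteria you actually use, and the stub replaces a real model client so the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical signature (an assumption, not a real client API):
# (prompt, temperature) -> model output text.
ModelFn = Callable[[str, float], str]

# The "bribe" variant quoted in the report title.
TIP_SUFFIX = " I will tip you $200."


def passes(output: str, required: List[str]) -> bool:
    """Explicit criterion: every required substring appears in the output."""
    return all(term in output for term in required)


def compare_variants(model: ModelFn, cases: List[Dict]) -> Dict[str, float]:
    """Return the pass rate of plain vs. tip-suffixed prompts at temperature 0."""
    if not cases:
        raise ValueError("need at least one test case")
    hits = {"plain": 0, "tip": 0}
    for case in cases:
        # Same prompt, same temperature; only the tip suffix differs.
        if passes(model(case["prompt"], 0.0), case["required"]):
            hits["plain"] += 1
        if passes(model(case["prompt"] + TIP_SUFFIX, 0.0), case["required"]):
            hits["tip"] += 1
    return {name: count / len(cases) for name, count in hits.items()}


def stub_model(prompt: str, temperature: float) -> str:
    """Stand-in for a real model call; it ignores the tip entirely."""
    return "4" if "2 + 2" in prompt else "unknown"


cases = [{"prompt": "What is 2 + 2?", "required": ["4"]}]
print(compare_variants(stub_model, cases))  # {'plain': 1.0, 'tip': 1.0}
```

To use it for real, swap `stub_model` for a function that calls your model at the given temperature, and replace the substring check with your own criteria. If the two pass rates match on a representative set of cases, the tip is not buying anything.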
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:52:07.361437+00:00— report_created — created