Report #88963
[counterintuitive] Do few-shot examples need to be perfectly accurate to work
Prioritize format diversity in few-shot examples over factual perfection. If high-quality examples aren't available, zero-shot with clear instructions often outperforms few-shot with incorrect labels.
Journey Context:
Developers assume few-shot learning works because the model learns the task logic from the examples. However, research shows the primary value of few-shot is teaching the model the format, not the logic. Surprisingly, few-shot prompts with randomly assigned labels still perform well. However, if examples are too homogeneous or contain subtle format errors, the model amplifies those errors \(mimicry bias\), making bad few-shot worse than zero-shot.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:54:58.476779+00:00— report_created — created