Report #25049
[counterintuitive] Always including few-shot examples to improve task performance
Default to zero-shot with clear, detailed instructions. Add few-shot examples only when the task has an unusual format the model can't infer from a description alone, or when the model consistently misinterprets the instruction across multiple attempts.
Journey Context:
Few-shot learning was the headline capability of GPT-3 (2020). For years, 'show, don't tell' was the dominant prompting philosophy. But instruction-tuned models changed the calculus. Research on instruction tuning (Chung et al. 2022, 'Scaling Instruction-Finetuned Language Models') showed that instruction-following capability reduces the need for demonstrations. With 2024+ models, zero-shot prompting with detailed instructions matches or exceeds few-shot on most tasks. Few-shot examples carry real costs: they consume context window (often 200+ tokens per example), they can anchor the model so that it mimics the surface patterns of the examples rather than the underlying intent, and they can conflict with the model's trained behavior in subtle ways. Few-shot remains genuinely valuable in narrow cases: (1) highly unusual output formats that the model hasn't seen in training, (2) demonstrating a specific style that differs from the model's default, (3) disambiguating tasks where the instruction alone is ambiguous. But it should be the exception, not the default.
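The zero-shot default and the narrow escalation cases described above can be sketched as a small prompt builder. This is a minimal illustration, not a reference implementation: `PromptSpec`, `build_prompt`, `should_add_examples`, and `rough_token_count` are hypothetical names, the "Input:/Output:" layout is just one possible convention, the three-attempt threshold is an arbitrary choice, and the 4-characters-per-token figure is a crude assumption rather than a real tokenizer.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container: an instruction plus optional (input, output) demos."""
    instruction: str
    examples: list[tuple[str, str]] = field(default_factory=list)


def build_prompt(spec: PromptSpec, user_input: str, use_examples: bool = False) -> str:
    """Build a zero-shot prompt by default; include demos only when asked."""
    parts = [spec.instruction.strip()]
    if use_examples:
        for example_input, example_output in spec.examples:
            parts.append(f"Input: {example_input}\nOutput: {example_output}")
    parts.append(f"Input: {user_input}\nOutput:")
    return "\n\n".join(parts)


def should_add_examples(failures: int, attempts: int, unusual_format: bool,
                        min_attempts: int = 3) -> bool:
    """Escalate to few-shot only for an unusual format, or when every one of
    at least `min_attempts` zero-shot tries misread the instruction."""
    if unusual_format:
        return True
    return attempts >= min_attempts and failures == attempts


def rough_token_count(text: str) -> int:
    """Crude size estimate (assumes ~4 characters per token; not a tokenizer)."""
    return max(1, len(text) // 4)


if __name__ == "__main__":
    spec = PromptSpec(
        instruction="Classify the sentiment of the input as positive or negative.",
        examples=[("The battery lasts all day.", "positive")],
    )
    zero_shot = build_prompt(spec, "The screen cracked in a week.")
    few_shot = build_prompt(spec, "The screen cracked in a week.", use_examples=True)
    # Compare the approximate context cost of the two variants.
    print(rough_token_count(zero_shot), rough_token_count(few_shot))
```

Keeping `use_examples` off by default makes the few-shot path an explicit decision, so the context cost of each demonstration is only paid when one of the escalation conditions actually holds.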
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:26:53.652307+00:00— report_created — created