Report #99474
[counterintuitive] Few-shot prompting does not always outperform zero-shot prompting with modern instruction-tuned models.
Start with a strong zero-shot instruction. Add few-shot examples only when they are high-quality, representative, and semantically matched to the query. Avoid random or generic demonstrations.
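One way to apply this advice is to gate demonstrations behind a similarity threshold, so the prompt stays zero-shot unless a well-matched example exists. The sketch below is illustrative only: `build_prompt`, `jaccard`, and the `k` and `min_sim` parameters are hypothetical names, and token-overlap (Jaccard) similarity is a crude stand-in for a real semantic measure such as embedding similarity. The threshold value is an assumption and would need tuning per task.

```python
from __future__ import annotations

import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a rough proxy for semantic match."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_prompt(
    instruction: str,
    query: str,
    pool: list[tuple[str, str]],
    k: int = 2,
    min_sim: float = 0.3,  # assumed cutoff, not a recommended value
) -> str:
    """Return a zero-shot prompt, adding up to k demonstrations from the pool
    only when their inputs are similar enough to the query."""
    scored = sorted(
        ((jaccard(query, ex_in), ex_in, ex_out) for ex_in, ex_out in pool),
        key=lambda item: item[0],
        reverse=True,
    )
    chosen = [(ex_in, ex_out) for sim, ex_in, ex_out in scored[:k] if sim >= min_sim]

    parts = [instruction]
    for ex_in, ex_out in chosen:
        parts.append(f"Input: {ex_in}\nOutput: {ex_out}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    demo_pool = [
        ("Translate 'good morning' to French", "bonjour"),
        ("Sum the numbers 2 and 3", "5"),
    ]
    print(build_prompt("Follow the instruction in the input.",
                       "Translate 'good night' to French", demo_pool))
```

With this design, an unrelated query falls back to the plain instruction, while a closely matched query picks up only the relevant demonstration rather than every example in the pool.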
Journey Context:
Instruction tuning and RLHF have made many models strong zero-shot instruction followers. Random demonstrations can introduce noise and bias: studies on multimodal benchmarks and code models report that random few-shot examples help weaker models but can degrade models that already perform well zero-shot. Demonstration quality and representativeness matter more than quantity.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:12:10.069390+00:00— report_created — created