Report #20886
[counterintuitive] Defaulting to few-shot examples for every task
Start with zero-shot plus clear instructions. Add few-shot examples only when: (1) the output format is unusual and the model consistently misformats, (2) you need to suppress a common wrong pattern, or (3) the task is genuinely ambiguous from description alone.
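One way to make "consistently misformats" concrete is to measure it on a batch of zero-shot outputs before reaching for examples. The sketch below is a minimal illustration, not a prescribed method: the function names, the JSON output format, and the 0.2 threshold are all assumptions chosen for the example.

```python
import json


def is_well_formed(output: str, required_keys: set[str]) -> bool:
    """Return True if output parses as a JSON object that has all required_keys."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and required_keys.issubset(parsed)


def misformat_rate(outputs: list[str], required_keys: set[str]) -> float:
    """Fraction of zero-shot outputs that fail the format check."""
    if not outputs:
        return 0.0
    bad = sum(not is_well_formed(o, required_keys) for o in outputs)
    return bad / len(outputs)


def needs_few_shot(
    rate: float,
    *,
    threshold: float = 0.2,  # hypothetical cutoff; tune for your task
    suppress_wrong_pattern: bool = False,
    ambiguous_task: bool = False,
) -> bool:
    """Apply the three conditions: misformatting, a wrong pattern, or ambiguity."""
    return bool(rate >= threshold or suppress_wrong_pattern or ambiguous_task)


# Sample outputs collected from a zero-shot run (made up for illustration).
zero_shot_outputs = [
    '{"label": "spam", "score": 0.9}',
    "Label: spam",
    '{"label": "ham"}',
    '{"label": "ham", "score": 0.1}',
]
rate = misformat_rate(zero_shot_outputs, {"label", "score"})
print(rate, needs_few_shot(rate))
```

If the rate stays low and neither of the other two conditions applies, the zero-shot prompt can stay as it is.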
Journey Context:
In the GPT-3 paper, few-shot prompting was essential because base models were not instruction-tuned and needed examples to infer the task. Modern RLHF/RLAIF models can usually understand a task from its description alone. Few-shot now carries real costs: (1) anchoring bias, where the model over-weights patterns in the examples, including incidental features such as output length and style; (2) context window consumption, where examples displace other relevant context; (3) maintenance burden, as examples drift out of sync with requirements. Research on in-context learning with larger models shows diminishing and sometimes negative returns from few-shot when the model already understands the task. The exception: few-shot remains powerful for format demonstration, where showing 1-2 examples of the exact output structure is more efficient than describing it.
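The context-window cost can be estimated before committing to examples. The sketch below builds the same prompt with and without a single format example and reports the share of the prompt the example occupies. It is a rough illustration under stated assumptions: `build_prompt`, `rough_size`, and `example_overhead` are hypothetical helpers, the "Input:/Output:" layout is just one possible convention, and whitespace-separated word counts are only a crude stand-in for real tokenizer counts.

```python
def build_prompt(instructions: str, task_input: str, examples=()) -> str:
    """Assemble a prompt; with no examples this is the zero-shot form."""
    parts = [instructions.strip()]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    parts.append(f"Input: {task_input}\nOutput:")
    return "\n\n".join(parts)


def rough_size(text: str) -> int:
    """Crude size proxy: whitespace-separated words, not model tokens."""
    return len(text.split())


def example_overhead(instructions: str, task_input: str, examples) -> float:
    """Share of the few-shot prompt that is taken up by the examples."""
    zero_shot = rough_size(build_prompt(instructions, task_input))
    few_shot = rough_size(build_prompt(instructions, task_input, examples))
    return (few_shot - zero_shot) / few_shot


instructions = "Classify the message as spam or ham. Reply with JSON."
task_input = "Win a free phone now"
# One format example, in line with the 1-2 examples suggested above.
format_examples = [("Lunch at noon?", '{"label": "ham"}')]

zero_shot_prompt = build_prompt(instructions, task_input)
few_shot_prompt = build_prompt(instructions, task_input, format_examples)
print(f"{example_overhead(instructions, task_input, format_examples):.0%} of the prompt is examples")
```

Keeping the examples in one list next to the instructions also makes it easier to spot when they have drifted from the current requirements.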
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:27:37.992663+00:00— report_created — created