Report #36337
[counterintuitive] More few-shot examples always improve in-context learning performance
Start with zero-shot, add 2-3 examples at first, and test before adding more; beyond a small number (often around 3-5), additional examples can degrade performance through attention dilution and pattern interference
Journey Context:
Classical ML intuition says more training examples yield better generalization, and developers carry this over to in-context learning by stuffing prompts with dozens of examples. But in-context learning is not gradient-based learning: no weights are updated, and the model conditions on the examples through attention at inference time. Research shows that the labels in few-shot examples often matter less than the format, and that performance often plateaus or degrades beyond roughly 3-5 examples, though the exact threshold depends on the model and task. The proposed mechanism: more examples mean more tokens competing for attention, which dilutes the signal from any one of them. The model can also pick up spurious correlations between example ordering and output patterns. Most counterintuitively, even wrong labels in few-shot examples often preserve much of the performance, which suggests the model is learning primarily the task format rather than the examples' content. Optimal few-shot usage is therefore a minimal set of examples that clearly demonstrates format and task structure, not exhaustive coverage of the input space.
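The "test before adding more" advice can be sketched as a small sweep over shot counts. This is a minimal illustration, not a reference implementation: `complete` stands in for whatever model call you use (a hypothetical callable that takes a prompt string and returns the completion text), and the `Input:`/`Label:` layout, the default shot counts, and the 0.01 tolerance are assumptions chosen for the example.

```python
from typing import Callable, Sequence

Example = tuple[str, str]  # (input text, expected label)


def build_prompt(instruction: str, shots: Sequence[Example], query: str) -> str:
    """Render the instruction, the demonstrations, and the query in one consistent layout."""
    parts = [instruction]
    for text, label in shots:
        parts.append(f"Input: {text}\nLabel: {label}")
    # The query uses the same layout as the demonstrations, with the label left open.
    parts.append(f"Input: {query}\nLabel:")
    return "\n\n".join(parts)


def accuracy_by_shot_count(
    complete: Callable[[str], str],  # hypothetical model call: prompt -> completion text
    instruction: str,
    pool: Sequence[Example],         # candidate demonstrations, in the order they will be used
    dev_set: Sequence[Example],      # held-out examples used only for scoring
    shot_counts: Sequence[int] = (0, 2, 3, 5, 8),
) -> dict[int, float]:
    """Measure exact-match accuracy on dev_set for each number of demonstrations."""
    results: dict[int, float] = {}
    for k in shot_counts:
        if k > len(pool):
            raise ValueError(f"shot count {k} exceeds pool size {len(pool)}")
        shots = pool[:k]  # k == 0 gives the zero-shot baseline
        correct = sum(
            complete(build_prompt(instruction, shots, text)).strip() == label
            for text, label in dev_set
        )
        results[k] = correct / len(dev_set)
    return results


def smallest_adequate_k(results: dict[int, float], tolerance: float = 0.01) -> int:
    """Return the fewest shots whose accuracy is within `tolerance` of the best observed."""
    best = max(results.values())
    return min(k for k, acc in results.items() if acc >= best - tolerance)
```

Because example ordering can itself affect results, repeating the sweep with the pool shuffled a few times and averaging is a reasonable extension before settling on a shot count.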
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:28:17.920312+00:00— report_created — created