Report #46679
[counterintuitive] More few-shot examples in the prompt always improve performance
Curate few-shot examples for diversity and relevance rather than quantity. Beyond 5-10 well-chosen examples, additional shots often hurt due to attention dilution. Invest effort in example quality, diversity, and ordering, not count.
Journey Context:
The intuition from traditional ML (more training data means better results) misleads developers into stuffing prompts with dozens of examples. But in-context learning is not gradient-based learning: the model's weights do not change, and the examples only condition the output distribution on the pattern the model perceives. Each additional example competes for attention, and the model's ability to extract the underlying pattern saturates or degrades. More examples also push the actual query further from the start of the context, where attention tends to be strong, and increase the chance that the model latches onto spurious surface patterns in the examples rather than the underlying rule. Past a point, noise dominates signal and performance declines. Research on example selection suggests that a few well-chosen, diverse examples generally outperform many similar ones.
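One way to act on this is to pick a small set of examples per query instead of pasting in the whole pool. The sketch below is a minimal illustration, not a reference implementation: it greedily selects up to `k` examples, rewarding similarity to the query and penalizing similarity to examples already chosen (a maximal-marginal-relevance style heuristic). The names `select_examples`, `trade_off`, and the `{"input": ..., "output": ...}` example shape are assumptions made for illustration, and token-overlap (Jaccard) similarity stands in for whatever embedding similarity a real system would likely use.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-set overlap in [0, 1]; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(query: str, pool: list[dict], k: int = 5,
                    trade_off: float = 0.7) -> list[dict]:
    """Greedily pick up to k examples that are relevant to the query
    but not redundant with examples already picked.

    trade_off (hypothetical parameter): 1.0 means relevance only,
    lower values penalize near-duplicates more strongly.
    """
    query_tokens = tokens(query)
    # Pair each example with its token set once, up front.
    candidates = [(ex, tokens(ex["input"])) for ex in pool]
    chosen: list[tuple[dict, set[str]]] = []

    while candidates and len(chosen) < k:
        best_index, best_score = 0, float("-inf")
        for i, (_, toks) in enumerate(candidates):
            relevance = jaccard(query_tokens, toks)
            # Similarity to the closest already-chosen example.
            redundancy = max(
                (jaccard(toks, picked) for _, picked in chosen),
                default=0.0,
            )
            score = trade_off * relevance - (1 - trade_off) * redundancy
            if score > best_score:  # strict: ties keep the earlier example
                best_index, best_score = i, score
        chosen.append(candidates.pop(best_index))

    return [ex for ex, _ in chosen]
```

With a pool that contains two near-identical refund examples, a call such as `select_examples("I want a refund for my order", pool, k=2)` would be expected to keep only one of them and spend the second slot on a different kind of example. The cap `k` and the value of `trade_off` are tuning knobs to validate against your own evaluation set, not recommended settings.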
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:49:28.655332+00:00— report_created — created