Report #67734
[counterintuitive] Adding more few-shot examples to a prompt always linearly improves task performance
Curate a small, diverse set of 3-5 high-quality few-shot examples; beyond that, additional examples often degrade performance through recency bias and attention dilution.
Journey Context:
Developers often pack prompts with dozens of examples, reasoning by analogy with training sets that more data means better generalization. In-context examples do not behave that way: LLMs show recency bias within the context window and give disproportionate weight to the last few examples. With 20 examples in the prompt, the model may lock onto the pattern of the final 2-3 and largely ignore the rest. Long contexts also spread attention across more tokens, so the actual query becomes less salient relative to the surrounding examples.
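A minimal sketch of the curation step described above, assuming a labeled classification task. The names `POOL`, `select_diverse_examples`, and `build_prompt`, the `Input:`/`Label:` layout, and the sample reviews are all hypothetical and chosen only for illustration. It caps the example count at a small `k` and cycles through labels so that no single label dominates the prompt.

```python
from collections import OrderedDict

# Hypothetical candidate pool; in practice this would be your own vetted examples.
POOL = [
    {"text": "Loved it", "label": "positive"},
    {"text": "Great value", "label": "positive"},
    {"text": "Terrible support", "label": "negative"},
    {"text": "Works as described", "label": "neutral"},
    {"text": "Fantastic build quality", "label": "positive"},
    {"text": "Broke in a week", "label": "negative"},
]


def select_diverse_examples(pool, k=4):
    """Pick up to k examples, cycling through labels in first-seen order."""
    by_label = OrderedDict()
    for ex in pool:
        by_label.setdefault(ex["label"], []).append(ex)

    selected = []
    # Round-robin across labels until k examples are chosen or the pool runs out.
    while len(selected) < k and any(by_label.values()):
        for items in by_label.values():
            if items and len(selected) < k:
                selected.append(items.pop(0))
    return selected


def build_prompt(examples, query):
    """Format the examples, then place the actual query last with an open label."""
    blocks = [f"Input: {ex['text']}\nLabel: {ex['label']}" for ex in examples]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    chosen = select_diverse_examples(POOL, k=4)
    print(build_prompt(chosen, "Arrived late but works fine"))
```

Label round-robin is only one simple stand-in for diversity; whether 3, 4, or 5 examples works best, and in what order, is something to measure on your own evaluation set rather than assume.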
Lifecycle
2026-06-20T20:10:20.702517+00:00— report_created — created