Report #78460
[counterintuitive] Does adding more few-shot examples to a prompt always improve performance?
Not always. Use 3-5 highly diverse few-shot examples with a balanced mix of labels; adding many more examples, or examples that are too similar to each other, can degrade performance and uses up the context window.
Journey Context:
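The intuition is that more examples give the model more patterns to learn from. However, LLMs are prone to majority label bias and recency bias. If you provide 10 examples of class A and 2 of class B, the model will tend to over-predict A (majority label bias). It also tends to favor the labels of the examples closest to the end of the prompt (recency bias), so the order of examples matters as well as the count. If examples are too similar, the model latches onto their specific phrasing rather than the underlying task rule, which hurts generalization.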
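One way to act on the points above is to pick few-shot examples programmatically instead of pasting in everything available. The following is a minimal sketch, not a definitive implementation: the function names (`select_few_shot`, `build_prompt`), the `(text, label)` pool format, the `Input:`/`Label:` prompt layout, and the token-overlap threshold of 0.6 are all assumptions made for illustration. It caps the number of examples per label to limit majority label bias, skips near-duplicates using a crude Jaccard similarity as a stand-in for diversity, and randomizes order so that one label is less likely to sit in a block at the end of the prompt.

```python
import random
from collections import Counter


def jaccard(a, b):
    """Token-level Jaccard similarity; a crude proxy for 'too similar'."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def select_few_shot(pool, per_label=2, max_similarity=0.6, seed=0):
    """Pick a small, label-balanced, non-redundant set of examples.

    pool: list of (text, label) pairs (hypothetical format).
    per_label: cap per label, so no class dominates the prompt.
    max_similarity: assumed threshold above which two texts count
        as near-duplicates and only one is kept.
    """
    rng = random.Random(seed)
    candidates = list(pool)
    rng.shuffle(candidates)  # random order, so labels are not grouped by input order

    chosen = []
    counts = Counter()
    for text, label in candidates:
        if counts[label] >= per_label:
            continue  # this label already has enough examples
        if any(jaccard(text, kept) > max_similarity for kept, _ in chosen):
            continue  # too close to an example we already kept
        chosen.append((text, label))
        counts[label] += 1
    return chosen


def build_prompt(examples, query):
    """Format examples and the final query into one prompt string."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in examples]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    pool = [
        ("the battery lasts all day", "A"),
        ("the battery lasts all week", "A"),
        ("shipping was quick and painless", "A"),
        ("screen cracked within hours", "B"),
        ("refund took two months", "B"),
    ]
    print(build_prompt(select_few_shot(pool), "arrived late and damaged"))
```

With `per_label=2` and two labels this yields at most four examples, which is in line with the 3-5 range recommended here. Token overlap is only a rough diversity check; an embedding-based similarity would be a reasonable substitute if one is available.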
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:17:33.892277+00:00— report_created — created