Report #49424
[counterintuitive] Adding more few-shot examples or extensive background context can degrade model performance instead of improving it
Use only as many few-shot examples as the task needs (often 1-3). Keep instructions and context concise; remove irrelevant information even if it seems potentially helpful.
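One way to enforce this in practice is to cap the number of examples at prompt-assembly time. The sketch below is illustrative only: `build_prompt`, its `max_examples` parameter, and the `Input:`/`Output:` layout are hypothetical names chosen for this example, not part of any library or of the original report.

```python
from typing import List, Tuple


def build_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
    max_examples: int = 3,  # hypothetical default, following the "often 1-3" guidance
) -> str:
    """Assemble a compact few-shot prompt that keeps at most max_examples examples."""
    if max_examples < 0:
        raise ValueError("max_examples must be non-negative")

    # Keep only the first max_examples pairs; any extra examples are dropped.
    kept = examples[:max_examples]

    # Instruction first, then the kept examples, then the open query.
    parts = [instruction.strip()]
    for example_input, example_output in kept:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    candidate_examples = [
        ("great product", "positive"),
        ("arrived broken", "negative"),
        ("works fine", "positive"),
        ("never again", "negative"),
        ("love it", "positive"),
    ]
    print(build_prompt("Classify the sentiment.", candidate_examples, "not bad at all"))
```

Taking the first few examples is the simplest possible selection rule; which examples to keep, and whether 3 is the right cap, should be checked against the actual task.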
Journey Context:
Developers often assume that more context and more examples give the model a better 'understanding' of the task. However, in Transformer attention the softmax normalizes each token's attention weights so that they sum to 1 across all input tokens. As context length increases, the share of attention available to the core instruction can shrink. This 'attention dilution' or 'distraction' makes the model more likely to latch onto irrelevant parts of the prompt, or to average over the examples, instead of following the specific instruction. More tokens often mean a weaker signal.
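The dilution effect can be illustrated with a toy calculation. The sketch below assumes a single attention query, one instruction token with a fixed raw score, and a growing number of distractor tokens with a lower fixed score. The function name and the score values are hypothetical, and real models learn their scores, so this shows only the normalization arithmetic, not measured model behavior.

```python
import math


def instruction_attention_weight(
    n_distractors: int,
    instruction_score: float = 2.0,  # assumed raw score for the instruction token
    distractor_score: float = 0.0,   # assumed raw score for every other token
) -> float:
    """Softmax weight on one instruction token among n_distractors other tokens."""
    instruction_term = math.exp(instruction_score)
    distractor_terms = n_distractors * math.exp(distractor_score)
    # Softmax: the instruction's share of the total exponentiated score.
    return instruction_term / (instruction_term + distractor_terms)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        weight = instruction_attention_weight(n)
        print(f"{n:>6} distractor tokens -> instruction weight {weight:.4f}")
```

Under these assumptions the instruction's weight falls steadily as distractors are added, even though its own score never changes. A model that assigns the instruction a much higher score than the surrounding text would be less affected, which is why the effect is a tendency and not a guarantee.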
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:26:26.920997+00:00— report_created — created