Report #88075
[counterintuitive] Why doesn't adding more few-shot examples fix tasks the model fundamentally struggles with?
Use few-shot examples to demonstrate format and task structure, not to teach new capabilities. If a model can't do something zero-shot, adding examples may show it how you want the output formatted, but won't reliably teach it a new reasoning pattern. For genuine capability gaps, change your approach: decompose the task, give the model tools, or switch to a different architecture.
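As a rough illustration of "decompose the task" combined with tool use, the sketch below asks the model only to extract structure and leaves the exact computation to plain code. The names solve_with_decomposition, extract, and fake_extract are hypothetical, and fake_extract is a stand-in for a real model call so the example runs offline. It assumes the model can be prompted to return a small JSON object.

```python
import json
import operator
from typing import Callable

# Exact arithmetic is handled by ordinary code (the "tool"), not by the model.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def solve_with_decomposition(question: str, extract: Callable[[str], str]) -> int:
    """Step 1: the model only extracts operands and operator as JSON.
    Step 2: the answer is computed deterministically in Python."""
    raw = extract(question)      # hypothetical model call returning a JSON string
    parsed = json.loads(raw)     # e.g. {"a": 1234, "op": "*", "b": 5678}
    return OPS[parsed["op"]](parsed["a"], parsed["b"])


def fake_extract(question: str) -> str:
    """Stand-in for a real model call (assumption, for illustration only)."""
    return json.dumps({"a": 1234, "op": "*", "b": 5678})


result = solve_with_decomposition("What is 1234 times 5678?", fake_extract)
print(result)
```

The point of the split is that few-shot examples are still useful for the extraction step, where they pin down the JSON shape, while the part the model struggles with is moved out of the prompt entirely.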
Journey Context:
The common belief is that few-shot prompting is a universal capability amplifier: if the model can't do it zero-shot, just add examples. Min et al. (2022) showed that few-shot examples mainly convey the label space, the distribution of the inputs, and the output format, not the underlying input-to-label mapping. In their experiments on classification and multiple-choice tasks, replacing the correct labels in few-shot examples with random labels had only a small impact on performance, which suggests the model was picking up the task format rather than the task logic. This means few-shot is excellent for showing 'output JSON in this shape' but poor for teaching 'apply this novel reasoning pattern.' When developers pile on more and more examples trying to get a model to do something it can't, they're hitting a capability wall in the model, not an example-count problem.
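The random-label comparison described above can be set up with a few lines of prompt-building code. This is a minimal sketch, not the authors' code: the review texts, the "Review:/Sentiment:" template, and the function names are invented for illustration, and the block only constructs the two prompts without calling any model.

```python
import random


def randomize_labels(demos, label_space, seed=0):
    """Keep each input text but replace its label with one drawn
    uniformly at random from the label space."""
    rng = random.Random(seed)
    return [(text, rng.choice(label_space)) for text, _ in demos]


def build_prompt(demos, query):
    """Format demonstrations and the query with one fixed template
    (the template itself is an assumption for this sketch)."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demos]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


LABELS = ["positive", "negative"]
DEMOS = [
    ("Loved every minute of it.", "positive"),
    ("A dull, lifeless film.", "negative"),
    ("Sharp writing and great acting.", "positive"),
]
QUERY = "I would not watch this again."

random_demos = randomize_labels(DEMOS, LABELS)
gold_prompt = build_prompt(DEMOS, QUERY)
random_prompt = build_prompt(random_demos, QUERY)
```

Both prompts share the same inputs, label vocabulary, and layout, and differ only in whether each label is correct. Sending both to the same model and comparing accuracy over a held-out set is one way to check how much your own task depends on the input-to-label mapping rather than on format.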
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:25:09.673493+00:00— report_created — created