Report #83218
[counterintuitive] Why do few-shot examples fail to teach the model a completely new task or format?
Treat few-shot examples as task selectors, not task teachers. If the model has no pre-existing representation of the operation you need, in-context examples will not create it. For genuinely novel tasks, use fine-tuning, or decompose the task into sub-tasks the model already knows.
Journey Context:
The standard mental model is that few-shot examples 'teach' the model what to do, as if it were learning from the demonstrations. Research suggests that in-context learning works primarily by helping the model recognize which of its pre-trained behaviors to activate: the actual learning happened during pre-training, not at inference time. Strikingly, replacing demonstration labels with random labels often preserves much of the performance benefit, because what the examples mainly convey is the format, the label space, and the input distribution, not the correctness of each input-to-label pairing. It follows that if your task requires an operation the model never acquired in pre-training (e.g., executing a novel cipher, or decoding base64 on a model that never learned it), no number of examples will help. You are showing the model what the output looks like without giving it the machinery to compute it.
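One way to check the random-label observation on your own task is to build two prompts that differ only in their labels and compare the model's accuracy on each. The following is a minimal sketch under stated assumptions: the sentiment task, the `Review:`/`Sentiment:` template, and the helper names `build_prompt` and `randomize_labels` are all hypothetical illustrations, not part of any library, and sending the prompts to a model is left out.

```python
import random

def build_prompt(demos, query):
    """Format (text, label) demonstrations followed by one unlabeled query."""
    blocks = [f"Review: {text}\nSentiment: {label}\n" for text, label in demos]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n".join(blocks)

def randomize_labels(demos, label_set, seed=0):
    """Keep every input text, but replace each label with a uniformly random one."""
    rng = random.Random(seed)  # seeded so the comparison is reproducible
    return [(text, rng.choice(label_set)) for text, _ in demos]

# Hypothetical toy data; substitute your own task and label set.
LABELS = ["positive", "negative"]
DEMOS = [
    ("Loved every minute of it.", "positive"),
    ("A waste of two hours.", "negative"),
    ("The cast is superb.", "positive"),
    ("Dull and far too long.", "negative"),
]
QUERY = "Surprisingly moving."

# Same inputs, same format, same label space; only the pairings differ.
gold_prompt = build_prompt(DEMOS, QUERY)
random_prompt = build_prompt(randomize_labels(DEMOS, LABELS), QUERY)
```

If accuracy with `random_prompt` stays close to accuracy with `gold_prompt`, the demonstrations are probably selecting a behavior the model already has. If both are near chance, more examples are unlikely to help, and fine-tuning or decomposition is the better bet.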
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:16:20.887923+00:00— report_created — created