Report #48833
[counterintuitive] Why does few-shot in-context learning fail to teach the model a genuinely new algorithm or rule?
Distinguish between in-context learning (pattern-conditional generation) and weight-based learning (fine-tuning). Use fine-tuning for genuinely novel patterns. Use ICL only for steering the model toward capabilities it already possesses.
Journey Context:
The term 'in-context learning' (ICL) is misleading and feeds a widespread false belief: that examples in the prompt teach the model new capabilities the way training data does. In practice, ICL behaves like conditional generation: the model produces completions that are statistically consistent with the pattern in the prompt. This works when the desired behavior already lies within the model's capability distribution (it saw similar patterns during pretraining). It tends to fail when the task requires a genuinely novel computational procedure. No weights are updated, and the examples do not reliably install a new algorithm; instead the model appears to activate the closest pattern it already knows, a behavior that interpretability work associates with induction heads. This is why ICL works well for format changes ('output JSON instead of prose') but breaks down when asked to teach a new algorithm ('apply this novel sorting procedure'). The 'learning' in ICL names a superficial behavioral resemblance to learning, not the underlying mechanism.
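One way to check whether a given rule is within reach of ICL is to show the model a few demonstrations and score it on held-out inputs. The sketch below is illustrative only: `novel_sort` is a made-up stand-in for a "novel" rule, `build_few_shot_prompt` and `icl_accuracy` are hypothetical helper names, and `model` is assumed to be any callable that takes a prompt string and returns a completion string (no specific provider API is implied).

```python
from typing import Callable, List, Sequence, Tuple


def novel_sort(xs: Sequence[int]) -> List[int]:
    """Hypothetical 'novel' rule: evens ascending, then odds descending."""
    evens = sorted(x for x in xs if x % 2 == 0)
    odds = sorted((x for x in xs if x % 2 == 1), reverse=True)
    return evens + odds


def build_few_shot_prompt(
    examples: Sequence[Tuple[List[int], List[int]]], query: List[int]
) -> str:
    """Format demonstrations plus one unanswered query as a few-shot prompt."""
    blocks = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


def icl_accuracy(
    model: Callable[[str], str],  # assumed interface: prompt in, completion out
    demos: Sequence[List[int]],
    held_out: Sequence[List[int]],
) -> float:
    """Fraction of held-out inputs where the completion matches the rule exactly."""
    examples = [(d, novel_sort(d)) for d in demos]
    correct = 0
    for query in held_out:
        prompt = build_few_shot_prompt(examples, query)
        if model(prompt).strip() == str(novel_sort(query)):
            correct += 1
    return correct / len(held_out)
```

A low score on held-out inputs, despite correct demonstrations in the prompt, would suggest the rule lies outside what prompting can steer; that is the case where this report recommends fine-tuning. Exact string matching is a deliberately strict choice here and may need loosening for models that add extra text.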
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:27:04.932209+00:00— report_created — created