Report #70728
[counterintuitive] Few-shot examples in the prompt teach the model new behaviors or knowledge, similar to how training data teaches the model
Use in-context examples for format specification and task disambiguation only. For genuinely new behaviors, patterns, or knowledge, use fine-tuning, RAG, or tool integration. Don't expect few-shot prompting to work for tasks the base model fundamentally can't do.
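A minimal sketch of the split described above, assuming a plain-text prompt layout: the few-shot examples only show the answer format, while the knowledge comes from a retrieval step. The `build_prompt` helper, the prompt layout, and the sample data are all hypothetical; the retrieval step is assumed to happen elsewhere.

```python
from typing import Sequence


def build_prompt(
    format_examples: Sequence[tuple[str, str]],
    retrieved_facts: Sequence[str],
    question: str,
) -> str:
    """Assemble a prompt where facts come from retrieval and examples only show format."""
    lines = ["Context:"]
    # Knowledge is injected explicitly instead of being "taught" through examples.
    lines += [f"- {fact}" for fact in retrieved_facts]
    lines.append("")
    # Few-shot pairs demonstrate the output shape (here, a small JSON object).
    for q, a in format_examples:
        lines += [f"Q: {q}", f"A: {a}", ""]
    lines += [f"Q: {question}", "A:"]
    return "\n".join(lines)


# Hypothetical data, for illustration only.
examples = [("What is the capital of France?", '{"answer": "Paris"}')]
facts = ["Internal service X listens on port 8443."]
prompt = build_prompt(examples, facts, "Which port does service X use?")
```

The single example fixes the answer shape; the fact about the hypothetical service is supplied as context because no number of examples would let the model know it otherwise.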
Journey Context:
In-context learning (ICL) looks like learning but isn't. The model doesn't update its weights based on few-shot examples; it performs sophisticated pattern completion using the weights it already has. Min et al. (2022) showed that replacing the correct labels in few-shot examples with random labels only slightly hurts performance on many tasks, which suggests that ICL primarily specifies the format and the task rather than transferring knowledge. Critical implications: (1) ICL is shallow: it adjusts output format and surface patterns but can't internalize new algorithms or deep domain knowledge. (2) ICL is fragile: changing example order, phrasing, or even the number of examples can dramatically shift results. (3) ICL has capacity limits: beyond a few examples, returns diminish sharply and can go negative as the model becomes confused by conflicting patterns. When few-shot prompting works, the model was already able to do the task; the examples just activate and shape that existing capability.
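One way to check these claims on your own task is to build prompt variants and compare model accuracy across them. The sketch below only constructs the variants: gold labels versus random labels (in the spirit of the Min et al. comparison, not their exact protocol) and several example orderings. All names, the prompt layout, and the sample reviews are hypothetical, and the model call is left out.

```python
import itertools
import random
from typing import Sequence

Example = tuple[str, str]  # (input text, label)


def format_prompt(examples: Sequence[Example], query: str) -> str:
    """Render demonstrations followed by the unanswered query."""
    body = "\n".join(f"Review: {text}\nSentiment: {label}\n" for text, label in examples)
    return f"{body}\nReview: {query}\nSentiment:"


def randomize_labels(examples: Sequence[Example], label_set: Sequence[str], seed: int = 0) -> list[Example]:
    """Keep inputs and format, but draw each label at random from label_set."""
    rng = random.Random(seed)
    return [(text, rng.choice(label_set)) for text, _ in examples]


def order_variants(examples: Sequence[Example], limit: int = 6) -> list[list[Example]]:
    """Return up to `limit` orderings of the same examples to probe order sensitivity."""
    return [list(p) for p in itertools.islice(itertools.permutations(examples), limit)]


# Hypothetical demonstrations, for illustration only.
labels = ["positive", "negative"]
gold = [
    ("Loved every minute.", "positive"),
    ("A complete waste of time.", "negative"),
    ("Great cast, great script.", "positive"),
]
random_labeled = randomize_labels(gold, labels)
gold_prompt = format_prompt(gold, "Not my thing.")
random_prompt = format_prompt(random_labeled, "Not my thing.")
orderings = order_variants(gold)
```

If accuracy with `random_prompt` stays close to accuracy with `gold_prompt`, the examples are mostly conveying format and label space. A large spread across `orderings` indicates the fragility described in point (2).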
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:18:07.432604+00:00— report_created — created