Report #88485
[counterintuitive] Providing a few in-context examples reliably overrides the model's default output format and behavior
For format or behavior changes that differ significantly from the model's training distribution, use explicit instructions combined with examples. Don't rely on examples alone: for distribution-shifted tasks, the model's prior toward its training distribution is stronger than in-context learning. Test whether the model reverts to defaults as generation continues past the examples.
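One way to pair an explicit instruction with examples is sketched below. This is only an illustration: `build_prompt`, the `Instruction:`/`Input:`/`Output:` labels, and the key|value format are hypothetical, and repeating the instruction just before the final query is an assumption rather than something this report verified.

```python
def build_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Combine an explicit format instruction with few-shot examples.

    The instruction appears before the examples and is repeated before the
    final query, so the format rule does not depend on the examples alone.
    """
    parts = [f"Instruction: {instruction}", ""]
    for source, target in examples:
        parts.append(f"Input: {source}")
        parts.append(f"Output: {target}")
        parts.append("")
    # Assumption: restating the rule near the end keeps it close to the generation point.
    parts.append(f"Reminder: {instruction}")
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical custom format used only for illustration.
prompt = build_prompt(
    "Write each record as key|value pairs separated by ';', with no JSON.",
    [("name=Ada, age=36", "name|Ada;age|36")],
    "name=Bob, age=41",
)
```

The resulting string can be sent to whichever model client is in use; the call itself is left out here because it depends on the provider.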
Journey Context:
The common pattern is to show 3-5 examples of a desired output format and expect the model to follow. In-context learning is powerful but has limits: the model's prior toward its training distribution acts as a strong regularizer. For formats close to training data (JSON, markdown, code), few-shot works well. For novel formats, unusual constraints, or behaviors that contradict training patterns, the model will often 'snap back' to its default behavior, especially as generation continues past the examples. The model is doing interpolation within its training distribution, not learning a genuinely new distribution from a handful of examples. Induction heads copy patterns from context, but they compete with the strong prior established during pretraining. When they conflict, the pretraining prior often wins.
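The 'snap back' described above can be checked mechanically by scanning a long generation for the first line that leaves the target format. The sketch below assumes the format can be described by a per-line regular expression; `first_drift_line`, `PATTERN`, and the sample output are hypothetical and stand in for real model output.

```python
import re
from typing import Optional


def first_drift_line(output: str, line_pattern: str) -> Optional[int]:
    """Return the 1-based index of the first non-empty line that does not
    fully match line_pattern, or None if every non-empty line matches."""
    regex = re.compile(line_pattern)
    for index, line in enumerate(output.splitlines(), start=1):
        stripped = line.strip()
        if stripped and not regex.fullmatch(stripped):
            return index
    return None


# Hypothetical custom format: key|value pairs joined by ';'.
PATTERN = r"\w+\|[^;|]+(;\w+\|[^;|]+)*"

# Stand-in for model output that reverts to JSON on the third record.
sample = 'name|Ada;age|36\nname|Bob;age|41\n{"name": "Eve", "age": 29}'
drift = first_drift_line(sample, PATTERN)
```

Running the same check on generations of increasing length shows whether the format holds only near the examples or throughout the output.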
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:06:17.147347+00:00— report_created — created