Report #98631
[counterintuitive] More in-context examples will teach the model a new algorithm or distribution
Keep in-context demonstrations within the model's training distribution; for genuinely novel algorithms, write code. Always validate on held-out cases rather than assuming the model has learned the rule.
Journey Context:
In-context learning is powerful but not magic. Anil et al. showed that transformers fail to length-generalize on algorithmic tasks such as parity and variable assignment, even with finetuning and scale. Performance improves meaningfully only when scratchpad/CoT prompting is combined with in-context learning, and some tasks still fail. The common error is to keep adding examples of a pattern the model has never seen and expect it to generalize. In practice the model behaves more like a pattern completer than a program inducer, so extra demonstrations of an unfamiliar rule mostly buy accuracy on inputs that resemble the demonstrations. The right call is to implement the algorithm in code and use the LLM to parse, route, or explain, rather than to execute the algorithm from its weights alone.
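A minimal sketch of that advice, using parity as the example task: the rule itself lives in ordinary code, and any predictor (for instance an LLM prompted with in-context examples) is scored on held-out inputs at lengths beyond those shown in the demonstrations. The names `held_out_accuracy` and `short_pattern_stub` are hypothetical and introduced only for illustration; the stub stands in for a real model call and is an assumption, not a measurement of any model.

```python
import random
from typing import Callable, Dict, List


def parity(bits: str) -> int:
    """Ground-truth rule, implemented in code: 1 if the number of '1's is odd."""
    return bits.count("1") % 2


def held_out_accuracy(
    predict: Callable[[str], int],
    lengths: List[int],
    trials_per_length: int = 50,
    seed: int = 0,
) -> Dict[int, float]:
    """Score `predict` against the coded rule on fresh random inputs per length."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    results: Dict[int, float] = {}
    for n in lengths:
        correct = 0
        for _ in range(trials_per_length):
            bits = "".join(rng.choice("01") for _ in range(n))
            if predict(bits) == parity(bits):
                correct += 1
        results[n] = correct / trials_per_length
    return results


def short_pattern_stub(bits: str) -> int:
    """Hypothetical stand-in for a model call (assumption, not real model output).

    It is exact up to 8 bits and ignores everything after the 8th bit,
    mimicking a predictor that matched the demonstrations but not the rule.
    """
    return parity(bits[:8])


# Lengths 4 and 8 play the role of "seen" lengths; 16 and 32 are held out.
report = held_out_accuracy(short_pattern_stub, lengths=[4, 8, 16, 32])
for length, accuracy in report.items():
    print(f"length={length:>2} accuracy={accuracy:.2f}")
```

To use this with a real model, replace `short_pattern_stub` with a function that sends the bit string to the model and parses a 0 or 1 from its reply. If accuracy drops at the held-out lengths, that is a signal to call `parity` (or the equivalent coded algorithm) directly and keep the model for parsing, routing, or explanation.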
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:17:52.985652+00:00— report_created — created