Report #27405
[counterintuitive] Using few-shot examples to teach a frontier model HOW to perform a coding task it already knows
Default to zero-shot with precise instructions. Reserve few-shot exclusively for: (1) demonstrating an unusual output format that's hard to describe verbally, (2) showing project-specific conventions that differ from standard practice, or (3) disambiguating between multiple valid interpretations. Even then, use 1-2 examples maximum and verify they don't anchor the model to their specific patterns.
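A minimal sketch of that default, assuming prompts are assembled as plain strings. The helper `build_prompt`, its parameters, the `MAX_EXAMPLES` cap, and the sample task text are all hypothetical names invented for illustration, not part of any library:

```python
from typing import Optional, Sequence

MAX_EXAMPLES = 2  # hypothetical cap mirroring the "1-2 examples maximum" advice


def build_prompt(
    task: str,
    criteria: Sequence[str],
    format_examples: Optional[Sequence[str]] = None,
) -> str:
    """Build a zero-shot prompt; add examples only to pin down output format."""
    if format_examples and len(format_examples) > MAX_EXAMPLES:
        raise ValueError(f"use at most {MAX_EXAMPLES} examples")

    # Zero-shot core: the task plus explicit, checkable requirements.
    lines = [f"Task: {task}", "", "Requirements:"]
    lines += [f"- {c}" for c in criteria]

    # Optional few-shot tail, scoped to format so it anchors less on content.
    if format_examples:
        lines += ["", "Output format (match the structure only, not the content):"]
        for i, example in enumerate(format_examples, start=1):
            lines += [f"Example {i}:", example]

    return "\n".join(lines)


# Capability task: no examples, just precise criteria.
zero_shot = build_prompt(
    "Refactor parse_config() to remove duplicated validation logic.",
    ["Keep the public signature unchanged", "Do not add new dependencies"],
)

# Format task: one example of a made-up internal release-note format.
with_format = build_prompt(
    "Summarize the change as a release note.",
    ["One entry per user-visible change"],
    format_examples=["[FIX][config] Reject empty keys"],
)
```

The cap is enforced with an exception rather than silent truncation so that a prompt drifting back toward elaborate few-shot is noticed. Whether the examples still anchor the model has to be checked against real outputs; this sketch does not do that.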
Journey Context:
In the GPT-3 era (2020-2022), few-shot was essential: the model needed examples to understand what you wanted. This created a culture of elaborate few-shot prompt engineering. But as instruction-following improved dramatically through RLHF and post-training, zero-shot performance caught up and often surpassed few-shot. Few-shot causes three problems on modern models. First, examples consume context window that could hold relevant code or documentation. Second, they anchor the model to the specific patterns in the examples, reducing its ability to find better solutions. Third, they can introduce subtle biases: three examples that all use for-loops make the model less likely to use list comprehensions even when those would be better. The one remaining valid use case is format specification: if you need output in an unusual internal format, one example is worth a thousand words of description. But for capability (teaching the model to debug, to architect, to refactor), zero-shot with clear criteria wins.
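The context-window cost can be estimated roughly before deciding whether examples are worth including. This sketch assumes about 4 characters per token, which is only a crude rule of thumb (real tokenizers vary by model and content), and the function names and the 8000-token budget are hypothetical:

```python
from typing import Sequence

CHARS_PER_TOKEN = 4  # crude assumption; use the model's real tokenizer if available


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def example_share(examples: Sequence[str], context_budget_tokens: int) -> float:
    """Fraction of the context budget spent on few-shot examples."""
    used = sum(estimate_tokens(e) for e in examples)
    return used / context_budget_tokens


# Three placeholder examples of 400 characters each against an assumed budget.
examples = ["x" * 400] * 3
share = example_share(examples, context_budget_tokens=8000)
print(f"examples use about {share:.1%} of the budget")
```

Every token counted here is one not available for the relevant code or documentation the task actually depends on.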
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:23:37.505320+00:00— report_created — created