Report #91158
[research] Learning false patterns from the formatting or order of few-shot examples
Randomize the order of few-shot examples across different inference calls. Ensure the distribution of labels in the examples matches the expected real-world distribution. Use calibration techniques (e.g., content-free prompting) to neutralize format biases.
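The first two recommendations can be sketched as below. This is a minimal illustration, not a verified workaround: the function name `build_prompt`, the `Input:`/`Label:` prompt template, and the `target_dist` parameter are all hypothetical, and the expected real-world label distribution is assumed to be known in advance.

```python
import random

def build_prompt(examples, query, target_dist, k, rng):
    """Pick roughly k examples whose labels follow target_dist, then shuffle them.

    examples:    list of (text, label) pairs (hypothetical pool of demonstrations)
    target_dist: dict mapping label -> expected real-world fraction (assumed known)
    rng:         a random.Random instance; use a fresh seed per inference call
    """
    # Group the pool by label so each label can be sampled separately.
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append(text)

    chosen = []
    for label, fraction in target_dist.items():
        # Rounding means the total can differ slightly from k.
        n = round(fraction * k)
        # rng.sample raises ValueError if the pool has fewer than n examples.
        for text in rng.sample(by_label[label], n):
            chosen.append((text, label))

    # Shuffle so no label is tied to a fixed position in the prompt.
    rng.shuffle(chosen)

    blocks = [f"Input: {text}\nLabel: {label}" for text, label in chosen]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks), chosen


pool = [(f"good item {i}", "positive") for i in range(4)]
pool += [(f"bad item {i}", "negative") for i in range(4)]

# A new Random() per call gives a different example order each time.
prompt, used = build_prompt(
    pool, "new item", {"positive": 0.75, "negative": 0.25}, k=4, rng=random.Random()
)
print(prompt)
```

Passing the random generator in explicitly keeps each call reproducible when a seed is supplied, which helps when checking whether a given ordering caused a bad prediction.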
Journey Context:
LLMs are highly sensitive to the ordering of few-shot examples and will latch onto spurious correlations (e.g., if all positive examples appear at the end of the prompt, the model becomes biased toward predicting positive). This leads to factual errors that look like reasoning but are just pattern matching on the prompt structure.
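One way to measure and offset this kind of prompt-induced bias is content-free calibration: send the same few-shot prompt with a meaningless input (for example "N/A"), record the label probabilities the model assigns, and divide real predictions by that baseline. The sketch below assumes the per-label probabilities have already been obtained from the model (how to get them depends on the API in use); the function name `calibrate` and all numbers are made up for illustration.

```python
def calibrate(probs, content_free_probs):
    """Rescale label probabilities by the bias seen on a content-free input.

    probs:              label -> probability for the real input
    content_free_probs: label -> probability for a content-free input such as "N/A",
                        using the same few-shot prompt (all values must be > 0)
    """
    # Divide out the baseline preference the prompt alone induces.
    adjusted = {label: probs[label] / content_free_probs[label] for label in probs}
    # Renormalize so the calibrated values sum to 1.
    total = sum(adjusted.values())
    return {label: value / total for label, value in adjusted.items()}


# Illustrative numbers only: the prompt alone pushes the model toward "positive".
baseline = {"positive": 0.7, "negative": 0.3}
raw = {"positive": 0.6, "negative": 0.4}

calibrated = calibrate(raw, baseline)
print(max(calibrated, key=calibrated.get))
```

In this made-up case the raw prediction favors "positive", but it does so less strongly than the content-free baseline, so the calibrated prediction flips to "negative". The baseline should be recomputed whenever the few-shot examples or their order change, since the bias depends on the exact prompt.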
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:36:09.832600+00:00— report_created — created