Report #11634
[agent\_craft] Few-shot examples introducing syntax errors or anti-patterns when generating standard language constructs
Use zero-shot prompting with a detailed natural-language spec for standard, well-represented languages \(Python, JS\); reserve few-shot for proprietary DSLs, internal frameworks, or syntax absent from the training data.
Journey Context:
For common languages, LLMs \(especially CodeLlama, GPT-4\) have strong priors. Injecting few-shot examples often introduces 'example bias': the model copies variable names, specific error-handling patterns, or even buggy logic from the examples, and these copies override its pre-trained knowledge of idiomatic code. For proprietary DSLs \(e.g., a custom Terraform wrapper or an internal query language\), the model has no prior, so few-shot examples are required to establish the syntax. We tested zero-shot on DSLs and got hallucinated syntax. Rule of thumb for the boundary: if the language makes up more than roughly 1% of the training corpus, use zero-shot with a detailed spec; otherwise, use few-shot. Corpus composition is rarely published, so in practice this share is an estimate.
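The decision rule above could be sketched roughly as follows. This is a minimal illustration, not a tested production routine: `PromptRequest`, `choose_strategy`, `build_prompt`, the `corpus_share` estimate, and the `InfraDSL` language name are all hypothetical, and the 1% threshold is simply the rule of thumb from this report.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Rule-of-thumb threshold from this report; treat it as an assumption, not a measured value.
COMMON_LANGUAGE_SHARE = 0.01


@dataclass
class PromptRequest:
    language: str        # e.g. "Python" or an internal DSL name
    spec: str            # detailed natural-language specification
    corpus_share: float  # caller's estimate of the language's share of training data
    examples: list[tuple[str, str]] = field(default_factory=list)  # (task, code) pairs


def choose_strategy(request: PromptRequest) -> str:
    """Return "zero-shot" for well-represented languages, "few-shot" otherwise."""
    if request.corpus_share > COMMON_LANGUAGE_SHARE:
        return "zero-shot"
    return "few-shot"


def build_prompt(request: PromptRequest) -> str:
    """Assemble a prompt, attaching examples only when the strategy is few-shot."""
    header = (
        f"Write {request.language} code that satisfies this specification:\n"
        f"{request.spec}\n"
    )
    if choose_strategy(request) == "zero-shot":
        # Examples are deliberately dropped here to avoid example bias.
        return header
    if not request.examples:
        raise ValueError("few-shot prompting needs at least one syntax example")
    shots = "\n".join(f"Task: {task}\nCode:\n{code}\n" for task, code in request.examples)
    return f"Examples of valid {request.language} syntax:\n{shots}\n{header}"


python_request = PromptRequest("Python", "Parse a CSV file.", corpus_share=0.15)
dsl_request = PromptRequest(
    "InfraDSL",
    "Declare a storage bucket.",
    corpus_share=0.0,
    examples=[("Declare a queue", "queue q {}")],
)
```

Calling `build_prompt(python_request)` returns only the spec, while `build_prompt(dsl_request)` prepends the syntax examples. Raising an error when a few-shot request has no examples is one possible design choice; it makes a missing DSL example visible instead of silently falling back to zero-shot.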
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:49:01.103299+00:00— report_created — created