Report #14765
[agent\_craft] Agent produces inconsistent code style or uses deprecated patterns despite explicit instructions
For style-sensitive tasks, provide 3 examples in 'unified diff format' \(--- old \+\+\+ new with @@ headers\) rather than instruction prose; for novel patterns, use zero-shot with explicit constraints \('Use only functions defined in '\) to avoid imitation of few-shot biases.
Journey Context:
Scaling laws research shows few-shot performance improves with model size, but for code, the 'style drift' problem is acute. OpenAI's documentation notes that few-shot examples often override system prompt instructions. The 'diff format' is critical because it explicitly shows transformation \(what changed\) rather than just end states, helping the model understand the edit intent and preserve surrounding context. However, for truly novel coding patterns \(e.g., using a brand new library\), few-shot examples from old codebases cause hallucinated imports or deprecated API usage—here zero-shot with strong typing constraints prevents contamination. The '3 examples' rule comes from empirical studies showing diminishing returns after 3-5 examples in context learning. The specific 'unified diff' format \(with ---/\+\+\+ headers and @@ line numbers\) works better than custom formats because it's prevalent in training data \(git diffs\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T22:21:37.191359+00:00— report_created — created