Report #68275
[counterintuitive] AI-generated code defaults to secure best practices because it learned from quality sources
Run CWE-pattern detection on all AI output. The most common AI-generated vulnerabilities are well-known patterns (CWE-79, CWE-89, CWE-78, CWE-22) that SAST tools catch reliably. Never assume AI prefers secure patterns over common ones.
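As a rough illustration of what pattern detection looks for, here is a minimal sketch using only Python's standard `ast` module. It flags `.execute()` calls whose query argument is built by concatenation, `%` formatting, `.format()`, or an f-string (a CWE-89 smell). The function name `find_sql_string_building` and the sample snippet are hypothetical, and this toy check is not a substitute for a real SAST tool.

```python
import ast

def find_sql_string_building(source: str) -> list[int]:
    """Return line numbers of .execute() calls whose query is built dynamically.

    Hypothetical helper for illustration only; real SAST tools track data flow
    and cover far more cases than this syntactic check.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        # Only look at method calls named "execute" that have a first argument.
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args):
            continue
        query = node.args[0]
        # "..." + value   or   "..." % value
        is_concat = (isinstance(query, ast.BinOp)
                     and isinstance(query.op, (ast.Add, ast.Mod)))
        # f"... {value} ..."
        is_fstring = (isinstance(query, ast.JoinedStr)
                      and any(isinstance(v, ast.FormattedValue) for v in query.values))
        # "...".format(value)
        is_format = (isinstance(query, ast.Call)
                     and isinstance(query.func, ast.Attribute)
                     and query.func.attr == "format")
        if is_concat or is_fstring or is_format:
            flagged.append(node.lineno)
    return sorted(flagged)

# Assumed sample of generated code: lines 2 and 4 build SQL from strings,
# line 3 uses a parameterized query.
SAMPLE = '''
cur.execute("SELECT * FROM users WHERE name = '" + name + "'")
cur.execute("SELECT * FROM users WHERE name = ?", (name,))
cur.execute(f"SELECT * FROM users WHERE id = {user_id}")
'''

print(find_sql_string_building(SAMPLE))  # prints [2, 4]
```

The check is purely syntactic, so it misses queries assembled in a variable before the call; that gap is one reason to rely on a maintained SAST tool rather than ad hoc rules.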
Journey Context:
Pearce et al. tested Copilot against 89 CWE scenarios and found that it generated vulnerable code approximately 40% of the time, varying by language and CWE type. The critical insight: AI reproduces the statistical distribution of its training data, which includes both secure and insecure patterns. When an insecure pattern is more common in the training data (e.g. string concatenation for SQL queries rather than parameterized queries), AI preferentially generates it. Human intuition assumes AI would default to best practices, but it defaults to the most common practices, and the most common practice in open-source code is often insecure. The training data is a popularity contest, not a quality filter.
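To make the SQL example concrete, the following sketch contrasts the two patterns using Python's standard `sqlite3` module and an in-memory database. The table, the function names (`lookup_concat`, `lookup_param`), and the payload are assumptions chosen for illustration; they do not come from the Pearce et al. study.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_concat(conn: sqlite3.Connection, name: str) -> list:
    # Common but insecure pattern (CWE-89): user input is spliced into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def lookup_param(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the driver passes the input as data, never as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = make_db()
payload = "x' OR '1'='1"  # classic injection string, assumed for the demo

print(len(lookup_concat(conn, payload)))  # prints 2: every row leaks
print(lookup_param(conn, payload))        # prints []: no user has that name
```

Both functions return the same result for ordinary input such as "alice", which is why the insecure version tends to pass casual review and functional tests.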
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:05:05.866866+00:00— report_created — created