Report #99057
[counterintuitive] AI pair-programmers produce secure code unless explicitly misused.
Treat all AI-generated code as untrusted; run SAST, SCA, and fuzzing; require security review for auth, crypto, parsing, IPC, and any code that handles untrusted input.
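One way to make the review requirement enforceable is to flag changed files that touch the sensitive areas listed above. The following is a minimal sketch, not an established tool: the keyword list and the function name `needs_security_review` are hypothetical, and the substring match is deliberately over-inclusive (for example, it would also flag a file named `authors.md`).

```python
from pathlib import PurePosixPath

# Hypothetical keywords for the sensitive areas named in the report;
# adjust these to match your own repository layout.
SENSITIVE_KEYWORDS = ("auth", "crypto", "parser", "parse", "ipc", "input")


def needs_security_review(changed_paths):
    """Return the changed paths whose components mention a sensitive area."""
    flagged = []
    for path in changed_paths:
        # Compare case-insensitively against every directory and file name.
        parts = [part.lower() for part in PurePosixPath(path).parts]
        if any(keyword in part for part in parts for keyword in SENSITIVE_KEYWORDS):
            flagged.append(path)
    return flagged
```

A CI job could call this on the list of files in a change and require an extra approval whenever the result is non-empty. It complements SAST, SCA, and fuzzing rather than replacing them.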
Journey Context:
Pearce et al. evaluated GitHub Copilot on scenarios drawn from the MITRE CWE Top 25 and found that about 44% of the programs it generated for them contained vulnerable code. Some of Copilot's top-ranked suggestions introduced severe flaws such as out-of-bounds writes (CWE-787) and path traversal (CWE-22). Later work by Ullah et al. showed that LLMs cannot reliably identify or reason about security vulnerabilities. The root cause is not malice but a training-objective mismatch: the model predicts plausible-looking tokens, not exploit-safe ones. Security-sensitive code must therefore pass deterministic tooling and expert review before it ships.
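To make the path traversal class (CWE-22) concrete, here is a minimal sketch of the kind of check a reviewer would look for in code that joins a user-supplied path onto a base directory. The helper name `resolve_under` is hypothetical, and the sketch assumes Python 3.9 or later for `Path.is_relative_to`. It illustrates one defensive pattern and is not a reproduction of any snippet from the cited studies.

```python
from pathlib import Path


def resolve_under(base_dir, user_path):
    """Resolve user_path inside base_dir; raise ValueError if it escapes."""
    base = Path(base_dir).resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks,
    # so the containment check below sees the real target location.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

A naive `open(os.path.join(base_dir, user_path))` accepts inputs such as `../../etc/passwd` or an absolute path. The check above rejects both, because the resolved candidate no longer sits under the resolved base.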
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:14:19.176774+00:00— report_created — created