Report #77947
[cost_intel] Security vulnerability detection: cost-effective SAST with reasoning versus pattern matching
Use reasoning models (o1/o3) for complex taint analysis across more than three function boundaries and for second-order injection detection; use GPT-4o or Claude 3.5 Sonnet for known-pattern matching (OWASP Top 10 signatures) and linting.
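The routing rule above can be sketched as a small decision function. This is a minimal illustration, not a reference implementation: the `Candidate` record, its fields, and the tier labels are hypothetical names, and it assumes some cheap pre-pass has already estimated the hop count and flagged second-order or indirect flows.

```python
from dataclasses import dataclass

# Hypothetical tier labels; map them to whichever models you actually deploy.
PATTERN_TIER = "pattern-matching"  # e.g. GPT-4o / Claude 3.5 Sonnet
REASONING_TIER = "reasoning"       # e.g. o1 / o3

# Threshold taken from the report: more than three function boundaries.
MAX_HOPS_FOR_PATTERN_TIER = 3


@dataclass
class Candidate:
    """Hypothetical summary of a suspected source-to-sink flow."""
    hops: int                        # function boundaries between source and sink
    second_order: bool = False       # payload is stored first, then used later
    indirect_dispatch: bool = False  # indirect calls or factory patterns on the path


def choose_tier(candidate: Candidate) -> str:
    """Return the model tier that should analyze this candidate flow."""
    needs_reasoning = (
        candidate.hops > MAX_HOPS_FOR_PATTERN_TIER
        or candidate.second_order
        or candidate.indirect_dispatch
    )
    return REASONING_TIER if needs_reasoning else PATTERN_TIER
```

For instance, `choose_tier(Candidate(hops=1))` stays on the pattern-matching tier, while `choose_tier(Candidate(hops=2, second_order=True))` escalates. How the hop count is estimated is left open here; a call-graph walk or a cheap first-pass model are both plausible choices.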
Journey Context:
On the OWASP Benchmark and real-world CVE detection, o1 achieves 60-70% precision on complex taint flows (e.g., user input → sanitization → database → reflection), versus 25% for GPT-4o. Cost per file is $0.40 for o1 versus $0.01 for GPT-4o. For low-hanging fruit (SQLi built by immediate string concatenation, XSS with no output encoding, hardcoded secrets), however, GPT-4o matches o1 at 95%+ detection with standard prompts. Deployment pattern: run GPT-4o as a first-pass filter, which catches about 80% of vulnerabilities cheaply, then route complex dataflow cases (indirect calls, factory patterns, second-order injection) to o1. Common mistake: running o1 on the entire codebase, which burns the security budget on obvious bugs. The cliff: the cheaper model's accuracy drops sharply when a vulnerability requires cross-function semantic understanding (e.g., "this sanitization function is bypassed when the second argument is null") or inter-procedural analysis across more than three hops.
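The budget argument can be made concrete with a short calculation. The sketch below uses the per-file costs quoted above. The escalation rate (the fraction of files routed on to the reasoning model) is an assumption, since the report does not state it; the 80% figure refers to vulnerabilities caught, not files escalated. Function names are illustrative.

```python
# Per-file costs in USD, as quoted in the report.
COST_REASONING = 0.40  # o1
COST_PATTERN = 0.01    # GPT-4o


def reasoning_only_cost(n_files: int) -> float:
    """Cost of running the reasoning model on every file."""
    return n_files * COST_REASONING


def tiered_cost(n_files: int, escalation_rate: float) -> float:
    """Cost of a cheap first pass on every file plus escalation of a fraction.

    escalation_rate is an assumed input: the share of files that the first
    pass hands to the reasoning model.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    first_pass = n_files * COST_PATTERN
    escalated = n_files * escalation_rate * COST_REASONING
    return first_pass + escalated


if __name__ == "__main__":
    files = 1000
    rate = 0.10  # assumed: 10% of files need the reasoning model
    print(f"reasoning-only: ${reasoning_only_cost(files):.2f}")
    print(f"tiered:         ${tiered_cost(files, rate):.2f}")
```

Under these assumptions, 1,000 files cost about $400 with the reasoning model alone and about $50 with the tiered setup. The tiered approach stops paying off only as the escalation rate approaches 100%, where the first pass becomes pure overhead.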
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:25:48.210842+00:00— report_created — created