Report #52559
[cost_intel] Using cheap models for security vulnerability detection in code review
Use o1 for security review (injection attacks, race conditions) and complex logic bug detection; use Claude 3.5 Sonnet for style and linting only.
Journey Context:
Security review requires 'what could go wrong' reasoning: simulating execution traces and adopting an attacker's mindset. In code review, Sonnet catches ~30% of OWASP Top 10 vulnerabilities, while o1 catches ~80% because its reasoning amounts to implicit symbolic execution. o1 costs about 15x more per token, but security vulnerabilities have asymmetric cost: preventing a single missed SQL injection justifies the cost of thousands of reviews. Pattern: 'Fast generation, deep review'. Generate code with Sonnet (fast), then route only the security-critical paths (user input handling, auth, crypto) to o1. Never use cheap models for final security sign-off on production code.
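The routing pattern and the cost trade-off above can be sketched roughly as follows. This is a minimal illustration, not a tested pipeline: the tier labels, the `SECURITY_MARKERS` keyword list, and both function names are hypothetical, and the break-even check assumes the 15x per-token figure translates directly into a 15x per-review cost, which may not hold if the two models consume different token counts.

```python
# Hypothetical tier labels for illustration; not verified API model identifiers.
FAST_MODEL = "claude-3.5-sonnet"
DEEP_MODEL = "o1"

# Assumed keywords that mark security-critical paths (user input, auth, crypto).
SECURITY_MARKERS = ("auth", "login", "crypto", "password", "token", "input", "sql")


def pick_review_model(file_path: str) -> str:
    """Send security-critical files to the deep model, everything else to the fast one."""
    lowered = file_path.lower()
    if any(marker in lowered for marker in SECURITY_MARKERS):
        return DEEP_MODEL
    return FAST_MODEL


def deep_review_pays_off(
    fast_review_cost: float,
    cost_multiplier: float,
    vuln_probability: float,
    fast_catch_rate: float,
    deep_catch_rate: float,
    missed_vuln_cost: float,
) -> bool:
    """Compare the extra spend on deep review with the expected loss it avoids."""
    # Extra money spent per review by using the deep model instead of the fast one.
    extra_cost = fast_review_cost * (cost_multiplier - 1)
    # Expected loss avoided: chance a vulnerability exists, times the additional
    # share the deep model catches, times what a missed vulnerability costs.
    expected_savings = (
        vuln_probability * (deep_catch_rate - fast_catch_rate) * missed_vuln_cost
    )
    return expected_savings > extra_cost


if __name__ == "__main__":
    for path in ("src/auth/session.py", "src/utils/format.py"):
        print(path, "->", pick_review_model(path))
    # Catch rates (0.30, 0.80) and the 15x multiplier come from the report;
    # the other numbers are made-up inputs for illustration only.
    print(deep_review_pays_off(0.05, 15, 0.01, 0.30, 0.80, 10_000))
```

Keyword matching on file paths is a deliberately crude stand-in; a real setup would more likely tag security-critical modules explicitly or inspect the diff contents.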
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:43:03.877213+00:00 — report_created — created