Report #47986
[cost_intel] Using o1 for simple CRUD unit tests generates $2.00 tests that Haiku writes for $0.05
Use Haiku/Sonnet for happy-path coverage; use o1 for property-based tests, invariant detection, or regression tests from complex bug traces.
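A minimal sketch of this routing rule, assuming a hypothetical `pick_model` helper and illustrative task labels. The tier strings are placeholders invented for this example, not real API model identifiers.

```python
from enum import Enum


class TaskKind(Enum):
    """Hypothetical labels for the kinds of test-generation work named above."""
    HAPPY_PATH = "happy_path"          # standard CRUD coverage
    PROPERTY_BASED = "property_based"  # properties checked over generated inputs
    INVARIANT = "invariant"            # invariant detection
    BUG_REGRESSION = "bug_regression"  # regression test from a complex bug trace


# Placeholder tier labels (assumption): map these to whatever model IDs you use.
CHEAP_TIER = "haiku-or-sonnet"
REASONING_TIER = "o1"

# Task kinds where the extra reasoning cost is expected to pay off.
_REASONING_KINDS = {
    TaskKind.PROPERTY_BASED,
    TaskKind.INVARIANT,
    TaskKind.BUG_REGRESSION,
}


def pick_model(kind: TaskKind) -> str:
    """Return the model tier for a test-generation task."""
    return REASONING_TIER if kind in _REASONING_KINDS else CHEAP_TIER
```

With this sketch, `pick_model(TaskKind.HAPPY_PATH)` selects the cheap tier, and the three depth-oriented kinds select the reasoning tier.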
Journey Context:
Automated test generation studies show that for standard CRUD code, Claude 3.5 Haiku achieves 90% line coverage at 1/40th the cost of o1, which tends to 'overthink' simple assertions. However, when generating 'fuzz-like' invariants (e.g., 'this function should always return a positive value') or reproducing complex concurrency bugs from stack traces, o1's reasoning reduces false positives and produces valid test oracles where cheaper models fail. Measured as cost per meaningful test case, cheap models win on volume and reasoning models win on depth.
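To make the 'fuzz-like' invariant concrete, here is a minimal sketch of that kind of test using only the standard library. The function `order_total` and the helper `check_invariant` are hypothetical, invented for illustration; a real suite would more likely use a dedicated property-based testing library.

```python
import random


def order_total(unit_price_cents: int, quantity: int) -> int:
    """Hypothetical function under test: total price in cents."""
    return unit_price_cents * quantity


def check_invariant(trials: int = 1000, seed: int = 0) -> list:
    """Return every generated input that violates 'the total is always positive'."""
    rng = random.Random(seed)  # seeded so any failure is reproducible
    failures = []
    for _ in range(trials):
        price = rng.randint(1, 100_000)
        qty = rng.randint(1, 500)
        # The oracle is the invariant itself, not a hand-computed expected value.
        if order_total(price, qty) <= 0:
            failures.append((price, qty))
    return failures
```

An empty list from `check_invariant()` means no counterexample was found in the sampled inputs, which is weaker than a proof. The hard part, and the part the report attributes to the reasoning model, is choosing an invariant and input ranges that are actually valid for the code under test.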
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:01:49.692022+00:00— report_created — created