Report #41411
[cost_intel] Using o1 to generate 10k training examples for fine-tuning
Use 4o-mini for bulk synthetic data (95% of volume) and reserve o1 for adversarial and hard-negative examples (<5% of the dataset). This cuts generation cost by ~95% with minimal quality loss.
Journey Context:
Bulk synthetic data needs diversity and speed, not deep reasoning. o1 is too slow and expensive to run at volume (o1 at $6 vs 4o-mini at $0.15 per 1M tokens). It earns its cost on reasoning chains and adversarial examples, where the goal is to teach the fine-tuned model to reason. The resulting mix: 95% from the cheap model for diversity, 5% from the reasoning model for hard negatives.
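The arithmetic behind the 95/5 mix can be sketched as below. This is a rough estimate, not a billing calculator: it takes the per-1M-token prices quoted in this report at face value, and it assumes every example costs the same number of tokens on either model (in practice a reasoning model also bills hidden reasoning tokens, so hard examples are likely to cost more). The names `MixPlan` and `plan_mix` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MixPlan:
    bulk_examples: int               # examples generated by the cheap model
    hard_examples: int               # examples generated by the reasoning model
    blended_price_per_1m: float      # weighted price per 1M tokens for the mix
    savings_vs_all_reasoning: float  # fraction saved vs. using only the reasoning model


def plan_mix(
    total_examples: int,
    hard_fraction: float,
    cheap_price_per_1m: float,
    reasoning_price_per_1m: float,
) -> MixPlan:
    """Split a dataset between two models and estimate the blended cost.

    Assumption: each example uses the same number of tokens on both models.
    """
    if total_examples <= 0:
        raise ValueError("total_examples must be positive")
    if not 0.0 <= hard_fraction <= 1.0:
        raise ValueError("hard_fraction must be between 0 and 1")
    if reasoning_price_per_1m <= 0:
        raise ValueError("reasoning_price_per_1m must be positive")

    hard = round(total_examples * hard_fraction)
    bulk = total_examples - hard

    # Use the realised share after rounding so counts and cost stay consistent.
    hard_share = hard / total_examples
    blended = (1 - hard_share) * cheap_price_per_1m + hard_share * reasoning_price_per_1m
    savings = 1 - blended / reasoning_price_per_1m

    return MixPlan(bulk, hard, blended, savings)


if __name__ == "__main__":
    # Prices are the figures quoted in this report, not verified list prices.
    plan = plan_mix(10_000, hard_fraction=0.05, cheap_price_per_1m=0.15,
                    reasoning_price_per_1m=6.0)
    print(plan)
```

Under these assumptions, a 10k dataset splits into 9,500 bulk and 500 hard examples. With o1 at exactly 5% of the data the saving over an all-o1 run comes to about 92.6%, and it reaches about 95% once the o1 share drops to roughly 2.5%, which fits the "<5% of the dataset" guidance.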
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:59:01.316402+00:00— report_created — created