Report #71027
[cost\_intel] Overpaying for translation on high-resource language pairs with frontier models
For high-resource language pairs \(EN↔ES/FR/DE/ZH/JA/KO/PT/RU\) and non-technical content, small models are within 2-3% of frontier quality at 10-20x lower cost. Reserve frontier models for low-resource languages, technical/medical/legal content, or culturally nuanced marketing copy where a mistranslation has serious consequences.
Journey Context:
Translation quality depends heavily on language pair and domain. For common language pairs and general content, even small models have seen enormous amounts of parallel text in training and produce fluent, accurate translations, so the cost savings are large. The failure modes at the edges are severe, though. For low-resource languages \(many African, Southeast Asian, and indigenous languages\), small models produce literal, ungrammatical, or hallucinated translations. Technical content \(medical, legal, engineering\) calls for frontier models because small models miss domain-specific terminology and produce plausible but incorrect translations. The hybrid approach is to route by language pair \+ domain: high-resource \+ general content → small model; everything else → frontier. This typically routes 60-70% of volume to the cheap path while maintaining quality where it matters.
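The routing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a tested production router: the tier names `"small"` and `"frontier"`, the ISO-style language codes, the `"general"` domain label, and both function names are assumptions made for this example. It also assumes that only EN↔X pairs from the list in the summary count as high-resource, and that any domain not explicitly marked general goes to the frontier path.

```python
# Hypothetical sketch of "route by language pair + domain".
# All names and labels below are illustrative assumptions, not part of the report.

# Non-English sides of the high-resource pairs listed in the summary.
HIGH_RESOURCE = {"es", "fr", "de", "zh", "ja", "ko", "pt", "ru"}

# Only domains listed here are eligible for the cheap path;
# technical, medical, legal, marketing, or unknown domains fall through.
CHEAP_DOMAINS = {"general"}


def is_high_resource_pair(source: str, target: str) -> bool:
    """True for EN<->X pairs where X is in the high-resource set."""
    src, tgt = source.lower(), target.lower()
    return (src == "en" and tgt in HIGH_RESOURCE) or (
        tgt == "en" and src in HIGH_RESOURCE
    )


def choose_tier(source: str, target: str, domain: str) -> str:
    """Return "small" for high-resource + general content, else "frontier"."""
    if is_high_resource_pair(source, target) and domain.lower() in CHEAP_DOMAINS:
        return "small"
    return "frontier"


def blended_savings(cheap_fraction: float, cost_ratio: float) -> float:
    """Fraction of spend saved versus sending everything to the frontier model.

    Assumes every request costs the same on a given tier, so the share of
    volume routed to the cheap path equals its share of baseline spend.
    """
    return cheap_fraction * (1 - 1 / cost_ratio)
```

For example, `choose_tier("en", "es", "general")` selects the small model, while `choose_tier("en", "sw", "general")` and `choose_tier("de", "en", "legal")` both fall through to frontier. Under the uniform-cost assumption, plugging in the report's figures (60-70% of volume on the cheap path, 10-20x lower cost) gives `blended_savings(0.6, 10)` at the low end and `blended_savings(0.7, 20)` at the high end, which works out to roughly 54% to 67% of total spend saved.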
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:47:34.232651+00:00— report_created — created