Many companies are rolling out AI-first or AI-augmented services that directly or indirectly interact with users. This presents some potentially thorny issues because LLMs depend on the quantity and quality of their training data to perform well, and an overwhelming amount of that data is in English and a handful of other languages. Localizing your user interface and other customer touchpoints is the easy part, because that can all be done with human-augmented AI/MT translation processes. However, it doesn’t matter how good your UI and website look in Thai if the underlying service is dysfunctional in that language.

Have Native Speakers Test The User Experience

One of the reasons it is a good idea to hire bilingual staff where possible is that they can evaluate the product and look for functional issues while running it in their language. One of the problems with AI is that it can generate language that “sounds” right but is factually incorrect. That is something native speakers can spot that may go unnoticed by other testers (and you really don’t want to turn a broken experience loose on your users).

<aside> 👉

This is also something a transcreation agency like Mother Tongue can help with. These agencies hire copywriters who are native speakers of the target language, and they can assess translation quality as well as the style and accuracy of responses in their respective languages.

</aside>

Be Cautious With Language Releases (Or At Least Use The BETA Label)

It is generally not a good idea to shotgun-release new languages without thorough testing first. Even then, it is a good idea to label the release as BETA and to compare user metrics against English. Users are pretty tolerant of minor translation errors in the UI, so those generally won’t hurt usage much, but if they are using your service to do research and getting nonsensical results, they may peace out and not come back.
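As a rough illustration, here is a minimal sketch of how you might compare a newly released language against the English baseline. It assumes a hypothetical per-session export with language, task-completion, and return-visit columns; the file and column names are illustrative, not tied to any specific analytics tool.

```python
import pandas as pd

# Hypothetical per-session export; column names are illustrative only.
# Expected columns: language, completed_task (0/1), returned_within_7d (0/1)
sessions = pd.read_csv("sessions.csv")

by_lang = sessions.groupby("language").agg(
    sessions=("language", "size"),
    task_completion=("completed_task", "mean"),
    retention_7d=("returned_within_7d", "mean"),
)

# Express each beta language relative to English. A large gap in completion
# or retention suggests the underlying service (not just the UI translation)
# is struggling in that language.
baseline = by_lang.loc["en"]
report = by_lang.assign(
    completion_vs_en=by_lang["task_completion"] / baseline["task_completion"],
    retention_vs_en=by_lang["retention_7d"] / baseline["retention_7d"],
)
print(report.sort_values("completion_vs_en"))
```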

Secondary and Low-Resource Languages

Secondary and low-resource languages (languages with sparse training data) pose especially difficult problems, for two reasons. First, there may be orders of magnitude less training data to work with, so while your product might perform like GPT-4 in English, it could be a lot less reliable in Hindi. Second, a lot of the content on the open web is machine translated into these languages, which creates a garbage in → garbage out problem when that material is used to train the model.

Use English As A Bridge Language

One approach is to use a translation AI like DeepL or Google Translate to translate to and from English. Some AI providers may already be doing this behind the scenes, so it is worth investigating that before you add your own translation layer, which could just get in the way. There are some risks to doing this, as there will typically be some loss of information in each direction (the best models deliver accurate translations 80-90% of the time, which can make the difference between a good prompt and one that produces garbled results). Specialist translation AIs generally outperform generalist AIs like ChatGPT for translation. If you do this, it’s probably a good idea to make it clear to users, so if they speak a better-resourced language, they can use that instead.
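To make the pattern concrete, here is a minimal sketch of a bridge-language layer. The translate() and ask_llm() functions are hypothetical placeholders, not any specific vendor’s SDK; you would swap in a real translation API (such as DeepL or Google Cloud Translation) and your own LLM client.

```python
# Minimal sketch of a "bridge language" layer: translate the user's prompt
# into English, run the LLM in English, then translate the answer back.
# translate() and ask_llm() are hypothetical placeholders, not a real SDK.

def translate(text: str, source: str, target: str) -> str:
    """Placeholder for a call to a specialist translation engine."""
    raise NotImplementedError("wire up your translation provider here")

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to your LLM backend."""
    raise NotImplementedError("wire up your LLM provider here")

def answer_in_user_language(user_prompt: str, user_lang: str) -> str:
    # No bridge needed for English; skip the extra translation loss.
    if user_lang.lower().startswith("en"):
        return ask_llm(user_prompt)

    english_prompt = translate(user_prompt, source=user_lang, target="en")
    english_answer = ask_llm(english_prompt)
    # The return trip is the second place information can be lost, so it is
    # worth logging both sides for native-speaker review.
    return translate(english_answer, source="en", target=user_lang)
```

Keeping the bridge in one function like this also makes it easy to remove later if your AI provider adds stronger native support for a given language.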

<aside> 👉

Secondary and low-resource languages are often underserved or not served at all by translation engines. Sometimes you can find specialist translation platforms that target specific languages. What you’ll typically see is that translation accuracy is not as good for secondary languages, which causes information loss in both directions. The good news is that many people understand one or more of the top international languages where AI platforms perform well. For example, French is widely spoken in parts of Africa, so users there might find the platforms work better in French than in local languages and dialects.

</aside>

Good News: Native Multilingual LLMs Are On The Way

One encouraging development is that countries are developing their own AI models, trained in the relevant languages, to build natively multilingual services. In Switzerland, ETH Zurich and EPFL are developing an open-source, multilingual LLM that others can build on. I expect to see more efforts like this to create AI models that serve other regions. My personal hope is that open-source models will win out in the long run, because this will enable countries that might otherwise be underserved to build their own models. India, with its large population and the diversity of languages spoken there, is a great example.