Building A Dynamic Translation Pipeline

Building a translation pipeline for messages that are defined in code is straightforward, and most translation management systems support it out of the box. But how do you deal with content that lives outside of your codebase (for example, in a database or an external service)?

To localize this content, you will need to build a dynamic translation pipeline that operates at runtime.

The Basic Design Pattern

First, you’ll start off with a function; let’s call it t(text, context).

This function will contain the logic to fetch a translation from a web service or cache, or initiate a request for translation if this text has never been encountered before.

A typical implementation will work something like the following.

1. Generate a hash from the input text and context. This serves as the identifier (key) for the text and its translations. Alternatively, you can generate a camel-cased key from the text and context. The main thing is to have a unique key for each entry.
2. Check the local or in-memory cache to see whether a translation exists for the text (the locale code is stored in a global variable that the function can read).
   - If yes, return the translation.
   - If no, go to step 3.
3. Query an API endpoint to see whether the translation service has a translation. The request includes the hash code, source text, context message, and locale code.
   - If yes, update the local cache and return the translation.
   - If no, return the input text (the server will kick off a translation request for the newly encountered text in the background).

Server Side Implementation

You’ll need to build a simple REST API endpoint that answers translation requests from clients. This service will check a database to see if it has a translation for the requested string. This is a simple database lookup (you’ll probably want to use caching for performance).

If there is not yet a translation for the requested string, the request handler will do the following:

1. Create a new database record with the source text, context message, and hash code.
2. Optionally generate a machine translation, save it to the database, and return it. This is a good way to provide placeholder translations while human translation or review is underway.
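As a rough sketch of that handler logic, the example below uses an in-memory dictionary in place of the database. The names `Entry`, `catalog`, `handle_request`, and `SOURCE_LOCALE`, as well as the optional `machine_translate` callback, are hypothetical and only illustrate the flow; a real service would sit behind a REST framework and a persistent store.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    key: str
    locale: str
    text: str
    context: str
    is_machine: bool = False  # True for placeholder machine translations


# Stand-in for the database table, keyed by (hash key, locale code).
catalog: Dict[Tuple[str, str], Entry] = {}
SOURCE_LOCALE = "en-US"  # assumed source language


def handle_request(
    key: str,
    text: str,
    context: str,
    locale: str,
    machine_translate: Optional[Callable[[str, str], str]] = None,
) -> Optional[str]:
    # Known string: a plain lookup.
    existing = catalog.get((key, locale))
    if existing is not None:
        return existing.text

    # First sighting: record the source text, context, and key.
    catalog.setdefault((key, SOURCE_LOCALE), Entry(key, SOURCE_LOCALE, text, context))
    if locale == SOURCE_LOCALE:
        return text

    # Optionally store and return a machine translation as a placeholder.
    if machine_translate is not None:
        draft = machine_translate(text, locale)
        catalog[(key, locale)] = Entry(key, locale, draft, context, is_machine=True)
        return draft

    # Nothing to return yet; the client falls back to the source text.
    return None
```

Flagging machine output with `is_machine` is one way to let a later human translation overwrite the placeholder without ambiguity.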

<aside> 👉 It is a good idea to keep a counter that tracks the number of requests for each string, so you can avoid translating strings that are only requested once. A common developer error is to interpolate values into a message before passing it to the translation function. Every distinct value then produces a new string, which can flood your translation pipeline. If the request count is under a certain threshold, the string won’t be queued for human translation.

</aside>
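One possible shape for that counter is sketched below. The threshold value and the names `MIN_REQUESTS`, `request_counts`, and `record_request` are assumptions for illustration. The sketch never expires old counts, so a production version would likely track requests over a recent time window instead.

```python
from collections import Counter

MIN_REQUESTS = 5  # assumed threshold; tune it for your traffic
request_counts: Counter = Counter()


def record_request(key: str) -> bool:
    """Count one request and report whether the string now qualifies
    for the human translation queue."""
    request_counts[key] += 1
    return request_counts[key] >= MIN_REQUESTS
```

With this in place, a message that was interpolated before lookup (for example "Hello, Ana" and "Hello, Ben" as separate strings) produces many keys that each stay below the threshold, so none of them reaches a translator.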

The database schema for the message catalog will look something like this:

| Hashcode / Key | Locale Code | Text | Context | Recent Requests |
| --- | --- | --- | --- | --- |
| hello.World.18273 | en-US | Hello World | Hi there! | 10 |
| hello.World.18273 | es-419 | Hola Mundo | Hi there! | 2 |
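To make the catalog concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table and column names (`messages`, `message_key`, `message_text`, and so on) are illustrative choices, not a prescribed schema, and the Spanish row is tagged `es-419` here on the assumption that Latin American Spanish is intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE messages (
        message_key     TEXT NOT NULL,
        locale          TEXT NOT NULL,
        message_text    TEXT NOT NULL,
        context         TEXT NOT NULL DEFAULT '',
        recent_requests INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (message_key, locale)
    )
    """
)

# Sample rows: one source string and one translation sharing a key.
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?, ?)",
    [
        ("hello.World.18273", "en-US", "Hello World", "Hi there!", 10),
        ("hello.World.18273", "es-419", "Hola Mundo", "Hi there!", 2),
    ],
)


def lookup(message_key: str, locale: str):
    """Return the stored text for a key and locale, or None if absent."""
    row = conn.execute(
        "SELECT message_text FROM messages WHERE message_key = ? AND locale = ?",
        (message_key, locale),
    ).fetchone()
    return row[0] if row else None
```

The composite primary key on key and locale means each string has at most one row per language, and the translation lookup becomes a single indexed query.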

TMS Integration

Next, you will need to build a cron job that uploads message catalogs to your translation management system and downloads the completed translations from it.

Uploading Source Language Content