This post is part of the Ten Golden Rules of Global Content Strategy series.

Ah, translation memory… that fabulous database that makes your translations faster, cheaper, and more consistent. I really like the concept of translation memory. It is very elegant and, when done well, so helpful.

A quick review of the definition from our friends at Wikipedia:

A translation memory, or TM, is a database that stores so-called “segments”, which can be sentences, paragraphs or sentence-like units (headings, titles or elements in a list) that have previously been translated, in order to aid human translators. The translation memory stores the source text and its corresponding translation in language pairs called “translation units”.

When your content is translated, the translation units are automatically created and stored in your TM. When you pay to have your content translated, you only pay to translate the new words and segments. If you say the same thing, the same way, every time you say it, then you pay for the translation just one time per language. If, on the other hand, you get all creative and say the same thing in as many ways as possible, well, it gets pricey.
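To make the pricing idea concrete, here is a minimal sketch in Python. It assumes a toy TM stored as a dictionary and counts only exact matches. The function name `billable_words` and the word-count pricing are illustrative assumptions, and real translation tools also use fuzzy matching, which this sketch ignores.

```python
def billable_words(segments, tm, target_lang):
    """Count the source words in segments that have no stored translation yet.

    tm maps a source segment to a dict of {language code: translation}.
    Only exact matches count as "already translated" in this sketch.
    """
    total = 0
    for segment in segments:
        if target_lang not in tm.get(segment, {}):
            total += len(segment.split())
    return total


# Hypothetical TM that already holds one French translation unit.
tm = {"What is for dinner?": {"fr": "Ce qui est pour le dîner?"}}

# The first segment is already in the TM; the other two are new.
# Note that the reworded question counts as new, even though it means the same thing.
job = ["What is for dinner?", "We are eating fish", "What's for dinner?"]

print(billable_words(job, tm, "fr"))  # 7
```

The reworded question is the "getting creative" case: same meaning, different wording, so you pay again.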

The elegance of translation memory is really fabulous conceptually. Makes perfect sense. We should all be getting on the bandwagon. And most companies do. Most companies that translate have a TM that is stored by their translation vendor (usually called a Language Service Provider, or LSP). In fact, companies love TMs so much that each LSP has its very own version of the TM.

…Wait? What did you say?…

You heard me right. If your company works with more than one LSP, it is extremely likely (almost guaranteed) that each LSP has its own version of your TM.

How did that happen?

Let’s say you send some content to “LSP A” for French, Italian, German, and Spanish (otherwise known as FIGS) translation. During the translation process, your TM is going to be populated with translation units for the source English and each language.

For example, your database might look like this:

| English | French | Italian | German | Spanish |
| --- | --- | --- | --- | --- |
| What is for dinner? | Ce qui est pour le dîner? | Che cosa è per la cena? | Was ist für Abendessen? | ¿Qué es para cenar? |
| We are eating fish | Nous mangeons du poisson | Stiamo mangiando pesce | Wir essen Fisch | Estamos comiendo pescado |

The next time you, or any other writer, send these same segments to this same LSP, you don’t have to pay a thing, because they are already translated and safely ensconced in your TM.

Great, you say! And it is great.

Let’s say that you decide to have those same segments translated by “LSP B”. Why might you do this? Most larger companies use at least 2 (and sometimes way more, even up to as many as 8) different LSPs. I’m not 100% sure why they do this. I think that the reasons include redundancy, so they are not reliant on any one vendor, and perhaps the ability to put pricing pressure on the vendors when they compete for projects. Another reason is specific language expertise.

But I digress. So, LSP B receives the content, translates it into a few Asian languages, and ends up with this in their TM:

| English | Simplified Chinese | Japanese | Korean |
| --- | --- | --- | --- |
| What is for dinner? | 晚餐是什么? | 夕食は何ですか? | 저녁 식사를 위해 무엇입니까? |
| We are eating fish | 我们吃的鱼 | 我々は魚を食べている | 우리는 물고기를 먹고 |

Still great, you say! And it is.

But not really. Because now you have the translations for the same segments stored in two completely separate databases, controlled by two completely different companies. Maybe having segments in five languages in one database and four in another isn’t a problem at first. But imagine how problematic this can become as you translate more content into more languages.

But wait. It gets worse.

What if someone in a different department decides to use LSP A for Chinese, Japanese, and Korean translation? And what if their source content contains the same segments? Well, now the TM for LSP A looks like this:

| English | French | Italian | German | Spanish | Simplified Chinese | Japanese | Korean |
| --- | --- | --- | --- | --- | --- | --- | --- |
| What is for dinner? | Ce qui est pour le dîner? | Che cosa è per la cena? | Was ist für Abendessen? | ¿Qué es para cenar? | 晚餐是什么? | 夕食は何ですか? | 저녁 식사를 위해 무엇입니까? |
| We are eating fish | Nous mangeons du poisson | Stiamo mangiando pesce | Wir essen Fisch | Estamos comiendo pescado | 我们吃的鱼 | 我々は魚を食べている | 우리는 물고기를 먹고 |

Oh, this is great, you say! Everything is now in one place. Problem solved.

Except that you paid LSP A to translate the segments that you already paid LSP B to translate. You have, in effect, doubled the amount you paid for the Chinese, Japanese, and Korean translations.

If you work with more than one LSP, there is a very high probability that you have paid to have the same segments translated by each LSP into the same languages.

Adding insult to injury, it is quite possible that you will end up with two different French translations (for example) for the same English source.

At the end of the day, what you have on your hands is a big, expensive, confusing mess of multiple databases each containing its own unique version of the same content. Ack! And the more LSPs you have and the more translation you do, the worse the problem is.
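If you can get a copy of each LSP’s TM, a quick audit shows how bad the overlap is. The sketch below is only an illustration: the function name `compare_tms`, the dictionary layout, and the alternate French translation are all made up for this example.

```python
def compare_tms(tm_a, tm_b):
    """Find translation units that exist in both TMs.

    Each TM maps a source segment to {language code: translation}.
    Returns (duplicates, conflicts): units you paid for twice, and the
    subset of those where the two vendors disagree on the translation.
    """
    duplicates, conflicts = [], []
    for source, translations_a in tm_a.items():
        translations_b = tm_b.get(source, {})
        for lang in sorted(translations_a.keys() & translations_b.keys()):
            duplicates.append((source, lang))
            if translations_a[lang] != translations_b[lang]:
                conflicts.append((source, lang))
    return duplicates, conflicts


# Hypothetical contents of two vendors' TMs for the same source segment.
lsp_a = {"We are eating fish": {"fr": "Nous mangeons du poisson", "zh": "我们吃的鱼"}}
lsp_b = {"We are eating fish": {"fr": "On mange du poisson", "zh": "我们吃的鱼"}}

duplicates, conflicts = compare_tms(lsp_a, lsp_b)
print(duplicates)  # [('We are eating fish', 'fr'), ('We are eating fish', 'zh')]
print(conflicts)   # [('We are eating fish', 'fr')]
```

Every entry in `duplicates` is a translation you bought twice. Every entry in `conflicts` is a place where your content now says two different things in the same language.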

There are two solutions to the TM nightmare. One – which I call the TMX dance – is crude and difficult. The other – which I call centralized location TM – is elegant and relatively easy.

The TMX Dance

The TMX dance involves exporting the TM from one LSP and having the other LSPs import it. Most translation memory files can be exported to a standard file type called TMX (Translation Memory eXchange). So, you can have LSP A export its TM and provide the TMX file to LSP B. And you can have LSP B do the same.
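For the curious, a TMX file is plain XML: each translation unit is a `tu` element holding one `tuv` per language, with the text inside a `seg`. Here is a minimal sketch of reading one with Python’s standard library. The sample file content and the `read_tmx` helper are assumptions for illustration. Real exports carry more metadata and often use regional codes such as `en-US`, which this sketch does not normalize.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A tiny, made-up TMX document with one translation unit.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>We are eating fish</seg></tuv>
      <tuv xml:lang="fr"><seg>Nous mangeons du poisson</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(tmx_text, source_lang="en"):
    """Return {source segment: {language code: translation}} from TMX text."""
    root = ET.fromstring(tmx_text)
    tm = {}
    for tu in root.iter("tu"):
        variants = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                variants[tuv.get(XML_LANG)] = "".join(seg.itertext())
        source = variants.pop(source_lang, None)
        if source is not None:
            tm.setdefault(source, {}).update(variants)
    return tm


print(read_tmx(SAMPLE_TMX))
```

Note that this naive import lets the last translation read for a segment overwrite any earlier one, which is exactly the kind of silent decision that makes merged TMs hard to trust.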

A few problems with this dance:

  • Getting LSP A and LSP B to play nicely together. The LSPs have to cooperate to actually export the TMX files in a timely manner, so dancing doesn’t get in the way of the schedule. Most LSPs are fierce competitors, as the market has gotten very crowded.
  • Even if they do play nicely together (and some do!), you now have all of the translations in all of the different TMs. This includes all of the various French versions of the same segment. Instead of one big, messy problem split across vendors, you have the same big, messy problem copied into every one of them. While you may think that “someone” will eventually clean up this mess, trust me, you will never have the time.
  • Exporting and importing TMs is a one-shot deal. Every time you want the TMs to be synchronized, you have to do the TMX dance.
  • Each time you do the TMX dance, all of those mismatched translations end up in every LSP’s version of your TM. You still don’t have the time to clean it up.
  • And if you have more than two LSPs, the dance becomes truly interpretive.

It doesn’t take long before the TMX dance method of synching your TMs gets very cumbersome, messy, and rather useless.

Centralized TM

There is a better solution to the TM nightmare. And that solution is to have all of your LSPs use the same TM database. To do this, you need to store the TM in a single location that is accessible to all LSPs. And you need to create and enforce a workflow that ensures that each LSP updates the single, centralized TM, rather than creating its own offshoot branch.
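As a rough sketch of what that workflow means in practice, imagine a single shared store that every LSP checks before translating and writes back to afterward. The `CentralTM` class below is purely hypothetical and is not how any particular product works. It only illustrates the two rules: look up before you translate, and flag a disagreement for review instead of quietly creating a second version.

```python
class CentralTM:
    """A toy shared translation memory used by every LSP."""

    def __init__(self):
        self._units = {}        # (source, lang) -> (translation, lsp)
        self.needs_review = []  # conflicting submissions awaiting a decision

    def lookup(self, source, lang):
        """Return the stored translation, or None if it must be translated."""
        unit = self._units.get((source, lang))
        return unit[0] if unit else None

    def submit(self, lsp, source, lang, translation):
        """Store a new unit, or flag it if it disagrees with the stored one."""
        key = (source, lang)
        existing = self._units.get(key)
        if existing is None:
            self._units[key] = (translation, lsp)
            return "stored"
        if existing[0] == translation:
            return "already stored"
        self.needs_review.append((source, lang, translation, lsp))
        return "conflict"


central = CentralTM()

# LSP B translates the segment into Chinese and writes it back.
central.submit("LSP B", "We are eating fish", "zh", "我们吃的鱼")

# Later, LSP A gets the same segment and finds it already translated.
print(central.lookup("We are eating fish", "zh"))  # 我们吃的鱼
print(central.lookup("We are eating fish", "fr"))  # None
```

Because there is only one store, the second department’s project finds the Chinese translation instead of paying for it again, and a competing translation shows up in `needs_review` rather than in a separate database.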

There are a few vendors that offer this type of shared TM workspace service. One of them, Cloudwords, is a Content Rules partner. Their solution is quite elegant. It allows you to centrally manage your TM, regardless of the translation vendor. It also ensures that each vendor uploads the project-specific TM back into Cloudwords after each project. Cloudwords also provides an easy-to-use web interface for actively managing and sharing your TM assets across your organization.

You can take a peek at the Cloudwords TM Management video: http://bit.ly/QdVatU. Or just jump into a free trial or conversation. They’re happy to talk TM till the cows come home.

As you are planning your global content strategy, think carefully about your translation memory. When created, stored, and updated properly, the TM can be an important part of a better, faster, and cheaper translation process.

Val Swisher