Translation memory (TM) is one of the most revolutionary inventions in the world of computer-aided translation. Think about the beauty of it. As a translator works on content, each source segment and its translated target segment (together called a translation unit) are stored in a database. The translation unit is stored automatically, with no input on the part of the translator. The next time content is sent to translation, the source content is compared to the TM to see if it has already been translated.
Does Your Content Match Your Translation Memory? It Could Cost You.
If the content is in the TM, it is considered a match. If every word of a source segment is the same as a segment already in the TM, it is considered a 100% match. If the source segment AND the sentences before and after it are a match, it is considered an in-context exact (ICE) match. Good news! You never have to pay to have an ICE match translated. ICE matches cost you (or at least should cost you) exactly $0.
If you have a 100% match, but not an in-context exact match, you have to pay a little bit for the translation. This is because the translator needs to make sure that the existing translation is valid within the new context.
Anything less than a 100% match is called a fuzzy match. Fuzzy matches are determined by the percentage of words that are identical to the original source content. You can have a 95% match, a 90% match, a 50% match, and so on. The fewer words that match, the more money you have to pay for the segment to be translated.
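To make the match levels concrete, here is a minimal sketch of word-level scoring. It is an illustration only: the function names `word_match_percent` and `classify_match` are made up for this post, and commercial TM tools use their own (often more sophisticated) scoring, so the percentages your vendor reports may differ.

```python
from difflib import SequenceMatcher


def word_match_percent(new_segment: str, tm_source: str) -> int:
    """Rough word-level similarity between a new segment and a TM source segment.

    Assumption: words are split on whitespace and compared exactly
    (case and punctuation count as differences).
    """
    new_words = new_segment.split()
    tm_words = tm_source.split()
    if not new_words or not tm_words:
        return 100 if new_words == tm_words else 0

    # Count the words that line up, in order, between the two segments.
    matcher = SequenceMatcher(None, new_words, tm_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return int(100 * matched / max(len(new_words), len(tm_words)))


def classify_match(percent: int, context_matches: bool) -> str:
    """Label a segment as ICE, 100%, or fuzzy, following the definitions above."""
    if percent == 100:
        return "ICE" if context_matches else "100%"
    return "fuzzy"


score = word_match_percent(
    "Click the Save button to continue",
    "Click the Save button to proceed",
)
print(score, classify_match(score, context_matches=False))  # 83 fuzzy
```

Changing a single word in a six-word sentence drops it from a 100% match to roughly an 83% fuzzy match in this sketch, which is exactly the kind of small edit that adds up on the invoice.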
It doesn’t take much to see that sending in-context exact and 100% matches lowers your translation bill. The savings can be quite substantial. You always want as much of the content you send to translation as possible to be a 100% match.
From a content creation standpoint, we need to say the same thing, the same way, every time we say it. That is the only way to send 100% matches – by not changing any words at all.
In the best of all possible worlds, you single-source your content so that you only ever send a single segment to translation one time. After that, you reuse that content over and over, in whatever content deliverables you need to produce.
However, we often write the same thing over and over again. And because we are human, it is the rare writer who can remember exactly what they wrote, word for word, from day to day.
What we need (listen up, tool manufacturers) is a way for content creators to see into the TM. We need a technology that gives writers access to the source side of the translation unit. That way, the writer knows exactly what has already been translated.
The Future of Translation Memory
Even better (I can dream, can’t I?) would be a tool that looks at what the content creator is writing, compares it to the TM, and then automatically pushes the matched segment back to the writer. The writer can then decide whether to alter the sentence so that it becomes a 100% match or to keep their changes.
Imagine how much your company would save on translation if this technology were available today.
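Just to show the idea is not far-fetched, here is a rough sketch of that lookup using Python's standard `difflib` module. Everything about it is an assumption for illustration: the function name `suggest_tm_segment`, the 0.7 cutoff, and the idea that the TM's source segments are available as a plain list of strings. Note also that `get_close_matches` compares characters rather than whole words, so its scores will not line up with a vendor's fuzzy match percentages.

```python
from difflib import get_close_matches
from typing import Optional, Sequence


def suggest_tm_segment(
    draft: str,
    tm_sources: Sequence[str],
    cutoff: float = 0.7,  # hypothetical threshold; tune to taste
) -> Optional[str]:
    """Return the TM source segment closest to the writer's draft, if any.

    Returns None when nothing in the TM is similar enough to be worth
    showing to the writer.
    """
    matches = get_close_matches(draft, tm_sources, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical source segments that have already been translated.
tm_sources = [
    "Click the Save button to continue.",
    "Restart the device before installing updates.",
]

draft = "Click the Save button to proceed."
suggestion = suggest_tm_segment(draft, tm_sources)
if suggestion is not None and suggestion != draft:
    print("Already translated:", suggestion)
```

The writer stays in control here: the sketch only surfaces the existing sentence, and the decision to adopt it or keep the new wording is still a human one.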
Want to quantify it? Start with your localization team. Ask for data on the fuzzy match performance of your source content. Take a look at how many fuzzy matches you are sending and the percentage match. Even better – see if you can get the actual content so that you can look at the segments themselves. That’s the best way of determining whether you can make changes to the source that will provide more 100% matches. More good news! The more 100% matches you send, the more likely it is that the context will match, too.
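If your localization team can export match data, even a tiny script makes the picture clearer. The sketch below totals words per match band. The band edges and the sample numbers are hypothetical placeholders, not real data; substitute the bands and figures from your own vendor's report.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical band edges (floor percentage, label), highest first.
BANDS = [
    (100, "100%"),
    (95, "95-99%"),
    (85, "85-94%"),
    (75, "75-84%"),
    (0, "below 75%"),
]


def band_for(percent: int) -> str:
    """Return the label of the first band whose floor the percentage reaches."""
    for floor, label in BANDS:
        if percent >= floor:
            return label
    raise ValueError(f"match percentage cannot be negative: {percent}")


def words_per_band(segments: Iterable[Tuple[int, int]]) -> Counter:
    """Total the word counts per band from (match_percent, word_count) pairs."""
    totals: Counter = Counter()
    for percent, word_count in segments:
        totals[band_for(percent)] += word_count
    return totals


# Made-up sample report: (match percentage, words in the segment).
report = [(100, 12), (97, 8), (97, 5), (60, 20)]
for label, words in words_per_band(report).items():
    print(f"{label}: {words} words")
```

The high fuzzy bands are the ones to look at first: those segments are usually only a word or two away from being 100% matches.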
I truly believe that ensuring 100% and in-context exact matches is one of the last untouched frontiers for translation savings. While many people think machine translation is nirvana (another topic for another day), I think standardizing source content to increase 100% matches is really the way to go.
When not blogging, Val can be found sitting behind her sewing machine working on her latest quilt. She also makes a mean hummus.