TMNT: Translation Memory and Neural Translation

From the dawn of civilization, translation has served humanity’s desire to carry ideas across language and cultural barriers. Historians consider the Akkadian translation of the Sumerian Epic of Gilgamesh to be the earliest translated text. (No, I’m not referring to Don Lee in Marvel’s Eternals, and no, he is not Wong from Doctor Strange.)

Closer to modern history, linguists raced to decipher the Egyptian hieroglyphs on the Rosetta Stone.

More recently, translation in the form of ciphers sent nations rushing to develop automatic translation devices, like the Enigma in World War II and IBM’s Thinking Machine during the Cold War.

“Does this mean the end of human translators? Yes for scientific and technical material but as regards poetry and novel, no I don't think we'll ever replace these translators” - Paramount News (1954)

Computer science has come a long way since the 1950s. With the recent renaissance of deep learning, state-of-the-art machine translation systems have been reported to achieve ‘human parity’, that is, translation quality comparable to that of human professionals, and to have correctly translated the essence of a French poem.

Is this the final frontier of machine translation?

Technologists, scientists, and big tech companies take any opportunity to tout record-breaking achievements. This is the point where an article would normally do some fear-mongering: “The end of human translators” or “Translators replaced by machines in the fourth industrial revolution”.

But this isn’t that kind of article. Instead, I would like to introduce readers to the simple notion of Translation Memory (TM).

There is no doubt that machine translation makes mistakes and human intervention is absolutely necessary for high fidelity translations (for now). Translation applications range from non-critical translations of Zelda games to precarious situations where a doctor needs to translate medical reports to give an accurate prognosis.

The correct translation is critical, especially in medical and pharmaceutical translations that require specialist world knowledge.

For example, a machine without some knowledge base would never be able to properly translate “przedawkowanie paracetamolu” (Polish) to “Acetaminophen/Tylenol overdose” (American English); a literal rendering of “paracetamolu” gives “Paracetamol”, which is the common name in British English.

For these situations, a human is needed to edit the machine translation, or a good Translation Memory should be able to take care of the terminology replacement.

Translation Memory is just a database

The simplest form of Translation Memory (TM) is a database of translated texts curated by human translators. Typically, before a translator translates a document, they use translation editing software that first searches the TM database for matches and pre-populates the translations for segments that are perfect or near-perfect matches.
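The pre-population step can be pictured with a minimal sketch. It assumes the TM is a plain Python dict and uses the standard library's `difflib.SequenceMatcher` as the similarity measure; real translation editors use their own fuzzy-match scoring, and the function name `tm_lookup` and the 0.75 threshold are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def tm_lookup(segment, tm, threshold=0.75):
    """Return (source, translation, score) for the best TM match, or None.

    A score of 1.0 is a perfect match; anything at or above `threshold`
    is treated as a near-perfect (fuzzy) match worth showing the translator.
    """
    best = None
    for source, translation in tm.items():
        # ratio() is a character-level similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, segment, source).ratio()
        if best is None or score > best[2]:
            best = (source, translation, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


# Toy TM built from the examples used in this article.
tm = {
    "przedawkowanie paracetamolu": "Acetaminophen overdose",
    "水のように": "be like water",
}

print(tm_lookup("przedawkowanie paracetamolu", tm))  # perfect match
print(tm_lookup("przedawkowanie paracetamol", tm))   # near-perfect match
print(tm_lookup("zapalenie płuc", tm))               # no usable match
```

A linear scan like this is fine for a toy TM; a production system would need an index to search millions of segments quickly.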

Translation Memories are a very useful tool for both humans and machines. They usually feature:

Smart Remembering: A translator has previously translated “水のように” as “be like water” and “火” as “fire”; the TM can now find a translation for “火のように”.
Reduce Repetition: Imagine the boredom of translating and re-translating websites and contract boilerplates; or translating “Link… Ganon's power grows...” every blood moon.
Introduce Knowledge: Besides the medical scenario illustrated above, other translation jobs carry equally high stakes and require specific terminology, e.g. getting sued for mistranslating car manuals (technical knowledge) or translation errors that could have started a war (cultural knowledge).
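To make “Smart Remembering” concrete, here is a rough sketch of one way a TM could recombine what it already knows. It assumes a segment-level TM plus a separate term list, both plain dicts, and swaps exactly one known term. The `recombine` function is hypothetical, the “水” to “water” term entry is an added assumption (the example above only states the full phrase and “火”), and real tools use far more careful sub-segment alignment.

```python
def recombine(segment, tm, terms):
    """Try to translate `segment` by swapping one known term in a TM entry."""
    if segment in tm:
        return tm[segment]  # perfect match, nothing to recombine
    for source, target in tm.items():
        for old_src, old_tgt in terms.items():
            # The term must appear on both sides of the stored pair.
            if old_src not in source or old_tgt not in target:
                continue
            for new_src, new_tgt in terms.items():
                if new_src == old_src:
                    continue
                # Does swapping the term turn the stored source into our segment?
                if source.replace(old_src, new_src) == segment:
                    return target.replace(old_tgt, new_tgt)
    return None


tm = {"水のように": "be like water"}
terms = {"水": "water", "火": "fire"}  # "水" -> "water" is assumed for this sketch

print(recombine("火のように", tm, terms))  # be like fire
```

The naive string replacement only works when the term maps one-to-one and needs no inflection, which is exactly why a human still reviews such suggestions.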

Wait a minute, isn’t TM just the training data for Machine Translation?

Yes, it can be, but it can also be much more. There are several scenarios in which a TM integrates with machine translation (MT) beyond serving as training data. Consider the following:

The TM is only available after the MT model is trained
The TM has constant updates and additions/deletions
The TM is used to correct MT mistakes

In the first scenario, companies and individuals that do not build their own machine translation engines have no other choice but to plug in the TM as an ad-hoc if-else, e.g.

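from aomame import GoogleTranslator

# Client for the hosted MT engine; substitute your own API key.
gt = GoogleTranslator(host="translation.googleapis.com", key="*******")

def translate(text, source_lang, target_lang, tm):
    # Prefer the human-curated TM entry when the segment matches exactly.
    if text in tm:
        return tm[text]
    # Otherwise fall back to machine translation.
    return gt.translate(text, source_lang, target_lang)

tm = {"przedawkowanie paracetamolu": "Tylenol overdose"}

text = "przedawkowanie paracetamolu"  # named `text` to avoid shadowing input()

print(translate(text, "pl", "en", tm))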

For the second scenario, imagine that the medical director decrees that all documents must use generic drug names, i.e. “Acetaminophen” instead of “Tylenol”. Even if Google somehow managed to get the right translation for the example above at some point, you still can’t walk into the Google office and enforce generic drug names in its translations.

In the last scenario, after exhausting all your means of training/tuning or complaining to Google, the model still may not learn to translate “paracetamolu” as “Acetaminophen”; you would then have to resort to using the TM on top of the MT for specific translations.
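The second and third scenarios can both be handled by a terminology layer that runs after the MT engine. The following is a minimal sketch, not how any particular vendor does it: `enforce_terms` and the `overrides` dict are hypothetical names, and the MT output is passed in as a plain string so the example runs without an API key.

```python
import re


def enforce_terms(mt_output, term_overrides):
    """Replace discouraged terms in MT output with the preferred ones."""
    if not term_overrides:
        return mt_output
    # Longest terms first so multi-word entries win over their substrings.
    ordered = sorted(term_overrides, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    lookup = {term.lower(): preferred for term, preferred in term_overrides.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], mt_output)


# The medical director's decree, stored as data rather than baked into a model.
overrides = {"Tylenol": "Acetaminophen", "Paracetamol": "Acetaminophen"}

print(enforce_terms("Paracetamol overdose", overrides))  # Acetaminophen overdose
print(enforce_terms("Tylenol overdose", overrides))      # Acetaminophen overdose
```

Because the overrides live in a dict, adding or deleting an entry takes effect on the next request, with no retraining involved.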

If the model ain’t learning, you ain’t tuning hard enough

It’s possibly true that the model will eventually learn the right translation once you concoct the right mixture of training data with the TM and turn the knobs on the hyperparameter ham radio. But what would it cost to fix that one translation? Consider the ROI of:

Infrastructure effort and computing cost of setting up a model tuning mechanism
Human effort of hyperparameter tuning or writing the code to tune hyperparameters
The time it takes to deliver the right translations to the user

Are there sentences that machine translation just can’t get right, no matter how much data/tuning you throw at it?

Regardless of the task, there will always be a data point that a machine can’t get right, especially when humans sometimes struggle too. The “Chihuahua or Muffin” problem exists in machine translation too.

Shiba or marshmallow? - Karen Zack, @teenybiscuit (2016)

Since the days of the GALE project, we have understood that some texts are harder to translate than others, most notably web text: misspelling-prone comments, abbreviations, slang, and fat-fingered typos like “covfefe” make web texts challenging to translate.

Why does Google Translate do so well on web text translation?

The magic in machine learning is often data, and more specifically, human-created data. Have you wondered why there is sometimes a human figure symbol next to the translation on Google?

That is an example of ad-hoc translation memory usage in machine translation. “Covfefe” was stored as a human-validated translation, most probably because it is a frequently translated word and Google wants to enforce preserving the word as the right translation.

Even if public translation APIs don’t explicitly tell you that humans curate their translation data behind the scenes, data cleaning is critical to a state-of-the-art machine learning model. So much so that there is a specific translation data cleaning shared task.
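To give a flavour of what cleaning translation data involves, here is a deliberately simple sketch. The heuristics (dropping empty sides, wildly mismatched lengths, and exact duplicates) and the `max_ratio` value are illustrative assumptions, not the method of any shared task or vendor; character-length ratios in particular are crude across different scripts.

```python
def clean_pairs(pairs, max_ratio=3.0):
    """Drop empty, lopsided, and duplicate (source, target) segment pairs."""
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # one side is empty
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            continue  # lengths too different to be a plausible translation
        if (source, target) in seen:
            continue  # exact duplicate of a pair we already kept
        seen.add((source, target))
        kept.append((source, target))
    return kept


pairs = [
    ("przedawkowanie paracetamolu", "Acetaminophen overdose"),
    ("przedawkowanie paracetamolu", "Acetaminophen overdose"),  # duplicate
    ("水のように", "be like water"),
    ("przedawkowanie paracetamolu", "OK"),                       # lopsided
    ("", "fire"),                                                # empty source
]

for source, target in clean_pairs(pairs):
    print(source, "->", target)
```

Rules like these only catch the obvious noise; deciding whether a plausible-looking pair is actually a correct translation is where human curators come in.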

Summary: Translation Memory (TM)

Managing expectations has been a challenge amid the hype, dating back to the 1950s, about MT systems stealing jobs from human translators. While NLP/MT technology is advancing at an unprecedented pace, languages and translations will always contain nuances that even humans find hard to grasp.

As we advance the state of machine translation, translation memory has its place in today’s translation tech stack, benefiting both MT users and human translators. Even if tech giants don’t explicitly tell you that humans creating data are the key ingredient that makes MT possible, they definitely do hire lots of translators indirectly by buying data from language data brokers.