The imagination of storytellers has for many years conjured the idea of a universal translator, and in the age of AI it may be possible, but it is not an easy thing to achieve. In this keynote, IP partners James Tumbridge and Robert Peake explore why we still have a way to go, and why reliance on automated legal translation in particular warrants a degree of caution.

Human language is difficult to master because it does not follow strict rules, and context often changes meaning. We conjugate verbs; language changes over time, moving from ‘thou’ to just ‘you’; and some words sound alike but have different meanings: think of ‘by’ and ‘bye’ or ‘here’ and ‘hear’. English also has different spellings across dialects, like ‘prioritise’ and ‘prioritize’. All of this makes learning to translate accurately challenging. When you stop to think about it, language is very complex. Humans often rely on visual cues and context to understand one another. For AI to translate, it needs rules and understanding, and the effort to create these is Natural Language Processing (NLP). The ability to use natural language in generative AI has given hope for the future of translation, but it is not yet perfect.

The dream of machine translation probably started in the 1950s, when Claude Shannon and his wife began their experiments. He worked on statistical extrapolation of text from relatively small samples, paired with an understanding of vocabulary and language structure. His foundational work later helped those who created Siri and other voice recognition tools. Combining this with language-focused software for dictation and translation services, and then adding generative AI, is revolutionising what is possible.
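For readers who want a feel for what statistical extrapolation of text from a small sample can look like, the sketch below may help. It is a deliberately simple illustration in Python, not Shannon's own method and not how modern translation tools work: it counts which word follows which in a short sample, then suggests the most common next word. The function names and the sample sentence are our own, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def build_bigram_counts(text):
    """Count how often each word is directly followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if none was seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Illustrative sample only; a real system would learn from far more text.
sample = (
    "the court read the contract and the court signed the contract "
    "and the court adjourned"
)
counts = build_bigram_counts(sample)

print(most_likely_next(counts, "the"))       # "court" follows "the" most often
print(most_likely_next(counts, "tribunal"))  # None: the word never appears
```

The second call shows the limitation in miniature: a word the sample never contained gives the model nothing to go on, which is the same problem, at a much larger scale, that languages with little written material present.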

Neural machine translation (NMT) systems are key to improving translation. NMT builds algorithms on neural networks to form foundation models. These models learn to provide translations by spotting patterns in large volumes of text in the languages they are trained on. NMT and other language models can, though, lack balance in their coverage of languages: the corpus of data varies from language to language, so it is easier to develop tools for languages with a greater library of written material. This imbalance means that translations may not be as accurate as they should be for every language, which is why the universal translator is not yet a reality.

The direction of translation has also proven challenging, as there are many more translation requests into some languages than into others for systems to learn from. As a result, translation into some languages can be less reliable than into others; even for the same pair of languages, the direction of translation affects accuracy. The script is a further challenge, as there is much more training on the Latin alphabet than on Cyrillic (as used for Russian), Japanese or Korean scripts, for example.

If you currently want certainty of accurate translation, then humans remain the gold standard, but things are changing. If nearly right is good enough, a few keystrokes can already get you a very good translation. Whilst AI won’t be putting human translators out of a job next year, it is helping to redefine the translation services industry. It is making the entire process far more streamlined and cheaper. The cost is in the training, but once you have a good system it greatly increases the amount you can translate.

AI already makes it easy for people to access quick and relatively reliable translations, but very accurate and nuanced translation of meaning is so far beyond it. Investment is continuing. For example, Google is investing in its Neural Machine Translation system, which, rather than translating literally word by word, relies on large databases and looks for language patterns, making comparisons to produce better, more reliable translations. Apple has integrated its Live Translation feature into its latest AirPods Pro model, with a promise to deliver real-time translation of the speaker’s meaning and not only individual words.

AI may struggle with idioms, slang, humour, cultural references, and capturing specific tone or emotion, which can lead to mistranslations in creative or sensitive content. The quality of AI translation is highly dependent on the amount of training data available, so it will take longer for rarer languages to reach the same standard: less commonly spoken languages tend to have less written material to train on than, say, English or Mandarin.

In the business world the key thing to remember is that accuracy is crucial to avoiding conflict. If you want a feel for a document, AI translation can be a great start, but do not rely on it alone to tell you what you are contracting for. How legal meaning is carried from one language to another can greatly affect a document’s consequences, so we urge caution when using AI to translate contracts. Similarly, whilst it may be tempting to seek an automated translation of a court judgment delivered in another language, the translation should be checked by someone with an understanding of both languages and both legal systems before any reliance is placed on it. Reliance on generative AI outputs for legal research has made headlines when non-existent cases have been cited in court proceedings; automated translation is no different, and skilled human review remains a must.

If you have questions or concerns about AI, please contact James Tumbridge and Robert Peake.