Learning a language costs effort, time and sometimes money. But in the 21st century it is not always necessary to study a language in order to communicate anywhere in the world. Simply take your cell phone out of your pocket, open a translator and type or dictate a phrase. Immediately, as if by magic, the smartphone translates it into almost any language. The most popular translator is Google's, which is used every day by millions of people worldwide and can translate more than 100 languages. But behind the magic that seems to take place inside the device there is a trick, and it is called artificial intelligence.
For Macduff Hughes, engineering director of Google Translate, the big change in the way translations are made came in 2016. That was when Google adopted a neural machine translation system: "The old translation method worked phrase by phrase and word by word, while the new one takes the complete sentence." "This new system is, paradoxically, much simpler, since the previous one took into account many rules on how to join sentences and reorder words," explains Hughes, who does not specify how many employees work on his team.
Google Translate uses patterns from millions of existing translations on the web to help decide the best translation. The number of translations on which the neural network has been trained determines the quality of the result: "The more data we have, the better the translation." Therefore, when deciding whether to add a language to the translator, the first step is to make sure that there is a reliable set of data on the web to train the system.
"We have to ask ourselves if there is enough data to create a model that meets our quality standards. If there is, we can usually develop it in a few months," he explains. When two languages are very different from each other, a greater amount of For example, grammatically English is very different from Chinese and Japanese, so more information is needed to obtain the same quality as when doing the same translation from English to Spanish.
To ensure that the data set on which the system is trained is of good quality, Google relies on human readers. Hughes warns, however: "Quality is important, but quantity always wins in the long term." While the European Union "has done a wonderful job in providing the world with translations because many documents have to be translated by law into other languages," not all languages are so fortunate. There is, as Hughes acknowledges, "an imbalance between the languages represented in the translator and the number of speakers in the world."
"We have to ask ourselves if there is enough data to create a model that meets our quality standards. If there is, we can usually develop it in a few months."
In addition to quality and quantity, it is also important that translated websites cover a wide range of topics. "Travel websites have many translations and we become very good at translating things about travel, but not so much when it comes to botany," says Hughes at an event on artificial intelligence in Zurich to which EL PAÍS was invited by Google. On top of this, most of the translations available on the web are produced in a professional context. The register used therefore differs from the way users actually speak in their day-to-day lives.
Biases and translation errors
When translations are not accurate, some users alert Google through messages or on social networks. Workers can also report any errors quickly. "We don't fix every wrong translation because we like to be as strictly algorithmic as possible and let the model do its job, but sometimes we do when a translation is offensive or misleading and can cause some kind of damage," he explains.
Hughes remembers a translation error in June. Amid the protests in Hong Kong, the system translated the phrase "I am sad to see Hong Kong become part of China" to "I am happy to see Hong Kong become part of China". That is, the translation suggested in simplified and traditional Chinese turned the word "sad" into "happy." The error, which caused a stir among different users, was corrected the same day.
These mistakes are not the only problem the company has to face. The translator is sexist. For example, it assumes that "a doctor" is a male doctor, while "a nurse" is a female nurse. Hughes acknowledges that there is still work to be done to solve this: "The basic design of machine learning systems is to find the most likely answer. But when you do this millions of times, you are reinforcing some social stereotypes."
To combat this kind of bias, Google Translate tries to show multiple gender options when they exist. The goal is for this feature, which today is only available for some words in languages such as Spanish, French or Portuguese, to reach all languages in the future.
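The difference between returning only the most likely answer and showing every gender option can be illustrated with a minimal Python sketch. It is a toy illustration, not Google's implementation: the counts in TOY_COUNTS are invented, and the names most_likely and all_gender_options are hypothetical.

```python
from collections import Counter

# Invented toy data (an assumption for illustration): how often a
# gender-neutral English word appears with each Spanish rendering.
TOY_COUNTS = {
    "doctor": Counter({"el doctor": 80, "la doctora": 20}),
    "nurse": Counter({"la enfermera": 90, "el enfermero": 10}),
}


def most_likely(word):
    """Return only the single most frequent translation.

    Repeated over many requests, this always picks the majority
    rendering, which is how a stereotype gets reinforced.
    """
    return TOY_COUNTS[word].most_common(1)[0][0]


def all_gender_options(word):
    """Return every observed translation, most frequent first."""
    return [translation for translation, _ in TOY_COUNTS[word].most_common()]


print(most_likely("doctor"))
print(all_gender_options("doctor"))
```

With these invented counts, most_likely never surfaces the minority form, whereas all_gender_options leaves the choice to the user.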
The Google Translate engineering director believes that in the future two people who speak different languages will be able to have a completely natural conversation in real time: "I think all the necessary pieces are there. We just need some improvements in voice recognition so that it works in noisy environments, in the quality of translation, in understanding the context and in text-to-speech conversion so that it sounds more natural."
Will the Google translator become an alternative to language learning? Hughes, who knows German and a bit of Spanish and Welsh, considers it a "different experience." "Traveling and using the translator is much easier than learning a language, but learning a language is a very rewarding experience and there is much you can do if you are really able to speak it," he adds.
One of the goals of his team is to develop models that can be trained with a much smaller amount of data. To learn a language "you don't need to see a billion sentences"; "a few thousand examples and a dictionary" can be enough. Achieving a continuous and fully instantaneous translation, in which the system starts translating even before the user finishes the sentence, and translating several languages with a single neural network are also among the main challenges.
Current systems serve up to five or eight languages. "In our laboratories we are trying to get a model that works for 103 languages," he says. But there are many more languages in the world than those currently supported by the Google translator (an estimated 7,000 or so): "Our great hope is to develop models that can generalize what it means to understand and learn a language and, hopefully, move from hundreds to thousands of languages."
Artificial intelligence to detect ‘spam’
Artificial intelligence is not only behind the Google translator. It is also present in users' email, where it detects spam, and on their mobile keyboards, where it makes predictions. Much more ambitious goals are also pursued with this technology. The Mountain View company promotes different projects to make life easier for people with disabilities. For example, Lookout is an app that assists blind or visually impaired people by providing information about what is around them. The company also develops applications and devices that allow people with speech difficulties to communicate. Google, like other companies such as Microsoft, also uses artificial intelligence to search for solutions to the planet's environmental challenges: for example, to monitor marine life, fight illegal fishing, keep track of endangered animals or predict natural disasters.
In its latest report, Surveillance Giants, Amnesty International accuses Google and Facebook of threatening human rights with their business model. For the organization, this model is based on surveillance and "is intrinsically incompatible" with the right to privacy. The Mountain View company collects a huge amount of user data at all times: it knows everything a user buys online, the places they visit, the apps they use and even the porn they watch, even in incognito mode. Olivier Bousquet, chief engineer of Google AI Europe, denies any kind of surveillance and argues in the company's defense that Google has the consent of its users: "The question is: do you know or have access to the information that is being collected? The user can check on the web what data is collected and has control over it." In addition, he says that the data is always used for a specific purpose and that the user has to be informed of what is going to be done with it. "Consent is not enough if you don't have transparency," he concludes.