Behind artificial intelligence is a large number of researchers who work every day to keep advancing the field and to add new functions and utilities. When we think of AI and Big Data, we imagine that behind all these advances there are engineers, mathematicians, scientists, computer scientists or programmers. And there are. But other professionals are needed too, such as linguists, psychologists and even philosophers.
These profiles, which seem to have little to do with each other, make up the multidisciplinary teams that come to light when one looks more closely at the day-to-day work and research of artificial intelligence.
To create intelligent instruments and tools, it is essential that they can communicate, and this is where the computational linguist comes in, a key figure in research on language technologies. According to Wikipedia, computational linguistics is an interdisciplinary field that deals with the development of formalisms that describe how natural language works, such that they can be turned into programs that a computer can execute. Linguists and specialist engineers must therefore transform existing information, both speech and text, into a structured language that artificial intelligence can understand and process, and to which it can generate a response. This work requires not only professions closely tied to science, but also experts in language and behavior.
Converting all that unstructured information into data that can be processed is the great challenge of natural language processing (NLP), one of the most developed areas of AI. NLP is currently one of the applications most in demand among companies that need to process and exploit the information they handle day to day or keep in their historical archives. Tasks such as machine translation, entity detection, information retrieval, automatic sentiment analysis, extraction of the main ideas of a text, trend detection and chatbot development are vitally important for many companies, because they allow them to listen to and learn from their users and their behavior.
Once these needs have been identified, the linguist, together with the rest of the team, begins the transformation process. The starting point of any NLP project is the corpus, a set of texts, ordered or not, that serves as the basis for any linguistic or statistical analysis. One of the linguist's main tasks is systematic and exhaustive annotation, which turns the set of texts into an annotated corpus. To do this, the linguist must label each term in the text precisely. It is a costly task, but an essential one if the AI is to start acting on this information.
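As a rough illustration of what an annotated corpus can look like, the following Python sketch stores one sentence with a label on every term. The `Token` class, the label names (`PROPN`, `PERSON`, and so on) and the example sentence are assumptions made for this illustration; they are not taken from any specific IIC project or annotation scheme.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One annotated term (hypothetical structure, for illustration only)."""
    text: str
    pos: str     # part-of-speech label assigned by the annotator
    entity: str  # entity label; "O" means the term is not part of an entity


# The raw corpus is just text.
raw_corpus = ["Carmen works in Madrid"]

# The annotated corpus carries a label for each term of each text.
annotated_corpus = [
    [
        Token("Carmen", "PROPN", "PERSON"),
        Token("works", "VERB", "O"),
        Token("in", "ADP", "O"),
        Token("Madrid", "PROPN", "LOCATION"),
    ],
]


def label_counts(corpus):
    """Count how often each part-of-speech label appears in the corpus."""
    return Counter(tok.pos for sentence in corpus for tok in sentence)


def entities(corpus):
    """Return every term that was annotated as part of an entity."""
    return [
        (tok.text, tok.entity)
        for sentence in corpus
        for tok in sentence
        if tok.entity != "O"
    ]


print(label_counts(annotated_corpus))
print(entities(annotated_corpus))
```

Once every term has a label, simple counts and lookups like these become possible, and that is the kind of information later analysis relies on.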
This corpus is then fed into linguistic engines, where it is analyzed at the morphological, syntactic and semantic levels by linguistic rules written for each level. Finally, at a more advanced stage, machine learning models are trained to automatically enrich texts with the correct labels. These procedures make it possible to perform all the NLP tasks that offer a multitude of possibilities to companies, institutions and public administrations according to their needs and characteristics.
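The two stages described above, hand-written rules followed by a model learned from the annotated corpus, can be sketched in a few lines of Python. This is a toy under stated assumptions: the suffix rules, the tiny training set and the most-frequent-label lookup are stand-ins chosen for brevity, not the linguistic engines or machine learning models actually used in production.

```python
from collections import Counter, defaultdict

# Hypothetical annotated data: (term, label) pairs produced by a linguist.
training = [
    ("the", "DET"), ("user", "NOUN"), ("writes", "VERB"),
    ("the", "DET"), ("message", "NOUN"), ("quickly", "ADV"),
    ("user", "NOUN"), ("reads", "VERB"),
]


def rule_tag(word):
    """Toy morphological rules: guess a label from the word's suffix."""
    if word.endswith("ly"):
        return "ADV"
    if word.endswith("ing"):
        return "VERB"
    return None  # no rule applies


def train(pairs):
    """Learn the most frequent label for each term seen in the annotated data."""
    counts = defaultdict(Counter)
    for word, label in pairs:
        counts[word.lower()][label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(sentence, model, default="NOUN"):
    """Label each term: rules first, then the learned model, then a default."""
    result = []
    for word in sentence.split():
        key = word.lower()
        label = rule_tag(key) or model.get(key) or default
        result.append((word, label))
    return result


model = train(training)
print(tag("The user reads slowly", model))
# [('The', 'DET'), ('user', 'NOUN'), ('reads', 'VERB'), ('slowly', 'ADV')]
```

Here "slowly" never appears in the training data and is labeled by a rule, while "reads" matches no rule and is labeled by the learned lookup, which shows in miniature why the two approaches complement each other.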
The huge variety of clients allows linguists to take on very different NLP projects. These range from creating algorithms to train chatbots that resolve questions and incidents, to detecting neologisms in a language, as in the project to locate Anglicisms in the Spanish used on social networks in the US, carried out by the Cervantes Institute at Harvard University in collaboration with the Institute of Knowledge Engineering (IIC).
The sciences and the humanities, despite the widespread notion that they are opposites, advance much faster when they work as a team. Computational linguistics is the field that best exemplifies this combination of profiles that seem, a priori, antagonistic. AI is an unstoppable technology, constantly reinventing itself and bringing great advances in all fields. One of the keys to this success is that it relies on multidisciplinary teams in which every discipline contributes and complements the others.
Carmen Torrijos is a computational linguist at the IIC.