For many people, a coconut (coco in Spanish) is just a fruit. For others, the word may refer to the human mind, to the imaginary creature used to frighten children, or to the title of a film about the Day of the Dead set in Mexico. In truth, depending on the context in which the word is used, it can mean much more than that. The term may even be a sign of terrorist radicalization. "In jihadist circles, 'coconut' is sometimes used to identify Muslims who, being brown on the outside, are white on the inside. That is to say, they sympathize with the infidel," explains José Manuel Gómez, head of R&D in Madrid at Expert System.
Text analysis helps this company use artificial intelligence to detect jihadist terminology, locate people indoctrinated through the Internet, uncover misleading information and establish whether different social media accounts are used by the same person. The technology developed by Expert System is used in the Dante and Trivalent projects, which are funded by the European Commission. Universities, police forces and European public administrations work on both initiatives with the aim of advancing the interpretation and understanding of behaviors and approaches related to terrorist threats.
"The hate messages in which the infidel is disqualified are the most typical and the easiest to detect," says Gómez. Precisely, the use of the term "infidel" can be an indication of this radical terminology. In the jihadist texts, weapons, explosives and the Koran are also usually present: "It is very curious because something that in principle is only a religious symbol is used to give legitimacy to what they are doing".
The artificial intelligence developed by Expert System first detects jihadist terminology in these texts and then performs a deeper analysis to classify them according to the different narratives that jihadist groups use to radicalize their sympathizers. From these data, police forces can identify people who may be undergoing indoctrination into terrorism.
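The two stages described here, spotting terminology and then assigning a narrative, can be illustrated with a deliberately simple keyword baseline in Python. This is only a sketch under stated assumptions: the article does not say how Expert System implements either stage, and the watchlist, the cue words, the narrative labels and the function names below are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical watchlist and narrative cue words, for illustration only.
# A production system would use trained models, not hand-written lists.
WATCHLIST = {"infidel", "coconut"}
NARRATIVE_CUES = {
    "war_against_infidel": {"war", "infidel", "fight"},
    "group_supremacy": {"rival", "supremacy", "caliphate"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def flag_terms(text):
    """Stage 1: return the watchlist terms that appear in the text."""
    return sorted(WATCHLIST & set(tokenize(text)))


def classify_narrative(text):
    """Stage 2: pick the narrative whose cue words occur most often.

    Returns None when no cue word appears, i.e. the text does not
    match any of the (hypothetical) narratives.
    """
    counts = Counter(tokenize(text))
    scores = {
        name: sum(counts[word] for word in cues)
        for name, cues in NARRATIVE_CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A text that matches no narrative returns None, which corresponds to the case where the security forces need not pay special attention to it.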
"Sometimes they broadcast messages through social networks or forums with the purpose of evangelizing about the war against the infidel," says Gómez. Another narrative is that of "Islamist supremacy between groups": "Between Al Qaeda and the Islamic State [ISIS, en sus siglas en inglés] there is a great rivalry in that sense ". The classification of texts in the different narratives serves to warn the security forces if they should pay attention to a specific text or "it is simply speaking in general terms of the Quran, religion or a lot of things that are Islamic but are not jihadists. "
The company has also designed a disinformation detector capable of identifying the use of "deceptive language". The system determines whether a document is true or false with a 75% success rate. One of the great advances of this detector is that it can analyze information from any domain, regardless of whether the system has previously been trained on it.
In the Dante project, three major data modalities are analyzed: text, audio and video. Live video is an example of why artificial intelligence mechanisms are needed to detect content in real time. "The bottleneck is precisely the manual annotation of those videos. For example, you have to mark and delimit where in the frame there is a terrorist, a jihadist flag, a weapon or a horse. Once you have that annotated corpus, you can train your model and analyze other videos in real time," explains Gómez.
The key, he argues, would be to combine the analysis of text, audio and video so that when the concept "Kalashnikov" is detected in a text, it is matched to the image of the rifle that appears in a video, to the audio of the shots and to the entity that represents the Kalashnikov in the knowledge graph. "When you are able to do that, you have a 360-degree view of the entities and the discourse you want to analyze," concludes the head of R&D in Madrid at Expert System.
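One way to picture this cross-modal matching is as resolving each detector's output to a shared entity identifier. The Python sketch below assumes a hand-written alias table. The table, the identifier format and the function name are hypothetical and are not taken from the Dante project.

```python
from collections import defaultdict

# Hypothetical alias table: (modality, detector label) -> entity ID.
ALIASES = {
    ("text", "kalashnikov"): "entity:kalashnikov",
    ("video", "rifle"): "entity:kalashnikov",
    ("audio", "gunshots"): "entity:kalashnikov",
}


def link_detections(detections):
    """Group (modality, label) detections by the entity they resolve to.

    Detections with no known alias are ignored. The result maps each
    entity ID to the set of modalities in which it was observed.
    """
    by_entity = defaultdict(set)
    for modality, label in detections:
        entity = ALIASES.get((modality, label.lower()))
        if entity is not None:
            by_entity[entity].add(modality)
    return dict(by_entity)
```

An entity observed in all three modalities would then carry the set {"text", "video", "audio"}, which is the combined view the quote alludes to.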
As an example of deceptive language, Gómez cites the fake reviews that can be found on hotel booking websites: "Although it is something very difficult to detect, the way in which the language is expressed is subtly different from reliable language." In the jihadist field, the disinformation detector is useful "to identify cases in which an allegedly jihadist author is not one and is using a social platform to send a message that does not correspond to his experience." "This helps the security forces to focus their efforts on people who are really dangerous," says the head of R&D in Madrid at Expert System.
To find out whether someone is recruiting other people from different profiles, the company has developed a stylometry analyzer that discovers patterns in how a person uses language in their communications. In this way, it can determine whether several social media accounts belong to the same person. The analysis also makes it possible to predict "with high reliability" characteristics of the person who controls a social media account, such as age or gender.
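A common textbook approach to stylometry compares how often short character sequences occur in two sets of writings. The Python sketch below uses character trigram counts and cosine similarity. It is an assumption made for illustration, not a description of Expert System's analyzer, and the function names are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after normalizing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def style_similarity(posts_a, posts_b, n=3):
    """Compare the writing style of two accounts, given their posts.

    Returns a score between 0.0 (no shared n-grams) and 1.0
    (identical n-gram profiles).
    """
    profile_a = char_ngrams(" ".join(posts_a), n)
    profile_b = char_ngrams(" ".join(posts_b), n)
    return cosine(profile_a, profile_b)
```

A high score only suggests that two accounts may share an author. With short or few posts the profiles are noisy, so any threshold would have to be calibrated on real data.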
Recruitment on social networks
Recruitment through online channels is "increasingly effective": "You only have to look at the number of fully open, English-language publications these groups have." The jihadists' objective, according to Gómez, is "to reach as many people as possible through the most direct channels possible." They therefore use social networks such as Twitter or Facebook to spread their message. "In addition, more and more radicalizing activity is being detected on messaging platforms. Not WhatsApp, because that is monitored, but on others like Telegram," says Gómez.
In December 2016, Twitter, Facebook, YouTube and Microsoft announced an alliance against terrorism. Under the agreement they signed, when one of the four companies identifies a post with extremist content on its network, it records it in a database shared with the others. The others can then use that information to locate and remove the same content from their own platforms.
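The sharing mechanism can be sketched in Python as a common registry of content fingerprints. The article does not say what the companies store, so this sketch assumes a SHA-256 digest of the raw bytes. An exact digest only matches byte-identical copies, and the class and method names are hypothetical.

```python
import hashlib


class SharedHashDatabase:
    """Toy registry of fingerprints for content flagged as extremist."""

    def __init__(self):
        # digest -> name of the platform that first reported it
        self._hashes = {}

    @staticmethod
    def fingerprint(content: bytes) -> str:
        """Assumed fingerprint: SHA-256 of the raw bytes."""
        return hashlib.sha256(content).hexdigest()

    def register(self, content: bytes, platform: str) -> str:
        """Record flagged content; keep the first reporter if repeated."""
        digest = self.fingerprint(content)
        self._hashes.setdefault(digest, platform)
        return digest

    def is_known(self, content: bytes) -> bool:
        """Check whether any platform has already flagged this content."""
        return self.fingerprint(content) in self._hashes
```

In this sketch, one platform calls register when it identifies a post, and the others call is_known on their own uploads to find the same content.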
In addition, in March 2018 the European Commission sent a clear message to Facebook, Google, YouTube and the other large Internet platforms: this type of content must be deleted within one hour of being flagged by police authorities or Europol. Gómez says that oversight is increasing and that accounts that advocate terrorism stay open for less and less time.
If there are clear indications that a particular individual is being radicalized, "the police do not have much trouble obtaining authorization from these companies." The problem comes when the evidence is less clear. "How much privacy are we willing to give up in order to have the safest possible environment? It is a complicated debate in which the security forces, companies and citizens all have to take part," reflects Gómez.
Limitations of artificial intelligence
To train its artificial intelligence, the company has extracted terminology from magazines such as Dabiq and Rumiyah, which are public and "characterize the jihadist message very well." The company would ideally have access to the posts that are removed from social networks. "One of the main problems we have is finding a large enough corpus of these types of messages that can be used to train a system with these characteristics," says Gómez.
In addition, artificial intelligence systems are trained on data that may be shaped by human prejudices. Gómez therefore argues that "in sensitive domains such as these" it is people who must make the final decision: "The idea is that this technology eases the workload of, for example, the police. That is, it helps them do their work more quickly and with less effort, but does not make decisions for them."