Anyone who wants to assemble a piece of IKEA furniture will usually rely on the figures that appear in the instructions. That is, they learn to autonomously execute a task in a real environment from two-dimensional images. This requires complex abilities, such as mentally conceiving a concept and memorizing it, recognizing and overcoming possible obstacles in physical space, and knowing how to distinguish the pieces that are needed from other objects. These are characteristics of human learning, studied by disciplines such as cognitive science and neuroscience. Finding a way to replicate them in robots is one of the current challenges for researchers and companies in the sector.
An article published this Wednesday in Science Robotics illustrates how machines can be trained to go beyond the mere imitation of an action. It describes a computational architecture that allows a robot to learn an abstract concept (for example, arranging scattered objects into a circle) and carry out different tasks related to that idea (such as forming a circle from objects of different colors, shapes, or sizes). The authors, members of the Californian company Vicarious AI, consider the work an advance toward machines capable of building "interpretable representations" and acquiring "common sense".
"To teach a concept [defined in the study as a redescription of an everyday experience at a higher level of abstraction], we show the robot several pairs of vignettes with objects. Each pair contains an initial scene and the scene that results from applying that concept to it," explains Miguel Lázaro-Gredilla, co-author of the article. The researcher adds that several pairs of two-dimensional images are needed to illustrate a concept without ambiguity. "If in one pair we see a green triangle and a green square stacked on the right margin, the concept could be to stack the green objects on the right, or to stack the squares and triangles on the right. With multiple pairs exhibiting the same concept, the ambiguity is reduced," he explains.
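The disambiguation idea Lázaro-Gredilla describes can be sketched as hypothesis filtering: keep only the candidate concepts that explain every example pair shown. The following toy Python sketch is an illustration of that principle only, not the paper's actual method; the scene representation, the two candidate concepts, and all names are hypothetical.

```python
# Toy sketch of concept disambiguation from (before, after) example pairs.
# A candidate "concept" is a function that transforms a scene; it survives
# only if it reproduces the "after" scene of every pair it is shown.
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    color: str
    shape: str
    x: int  # horizontal position; 10 = right margin

def stack_green_right(scene):
    # Hypothesis 1: move all green objects to the right margin.
    return frozenset(Obj(o.color, o.shape, 10 if o.color == "green" else o.x)
                     for o in scene)

def stack_shapes_right(scene):
    # Hypothesis 2: move all triangles and squares to the right margin.
    return frozenset(Obj(o.color, o.shape, 10 if o.shape in ("triangle", "square") else o.x)
                     for o in scene)

def consistent_concepts(candidates, example_pairs):
    """Keep only candidates that reproduce the 'after' scene of every pair."""
    return [c for c in candidates
            if all(c(before) == after for before, after in example_pairs)]

# One pair is ambiguous: a green triangle and a green square both end up right.
pair1 = (
    frozenset({Obj("green", "triangle", 2), Obj("green", "square", 4)}),
    frozenset({Obj("green", "triangle", 10), Obj("green", "square", 10)}),
)
print(len(consistent_concepts([stack_green_right, stack_shapes_right], [pair1])))  # prints 2

# A second pair, where a red square stays put, rules out the shape hypothesis.
pair2 = (
    frozenset({Obj("green", "circle", 1), Obj("red", "square", 5)}),
    frozenset({Obj("green", "circle", 10), Obj("red", "square", 5)}),
)
survivors = consistent_concepts([stack_green_right, stack_shapes_right], [pair1, pair2])
```

With only the first pair, both hypotheses fit; adding the second leaves a single surviving concept, mirroring the ambiguity reduction the researcher describes.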
Technically, what the Vicarious researchers have implemented in the machines is a visual cognitive processor (VCP). It is a cognitive architecture that, from a set of instructions, can build different programs in a specific language to generate concrete actions. According to Lázaro-Gredilla, it is similar to a CPU, but with some peculiarities.
One of its main features is "a vision system capable of segmenting objects (identifying which pixels belong to each object) and linking their visual properties (color, shape, size, position)," explains the researcher. Another, he adds, is "a model of the world's dynamics that allows it to foresee (imagine) the result of an action before executing it." In addition, "the visual cognitive processor has a series of local memories in which to store, symbolically at the level of objects rather than pixels, the imagined results of an action," he explains.
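To make the three components concrete, here is a deliberately minimal Python sketch in a toy grid world: a segmenter that turns pixels into object-level symbols, a dynamics model that "imagines" the result of an action without executing it, and an object-level working memory. Everything here (the grid representation, function names, the one-pixel-per-object segmenter) is a hypothetical illustration, not the architecture described in the paper.

```python
# Toy sketch of the three components: segmenter, dynamics model, object memory.
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class SceneObject:
    obj_id: int
    color: str
    position: Tuple[int, int]

def segment(pixels: List[List[str]]) -> List[SceneObject]:
    """Toy segmenter: each non-background pixel becomes one object.
    (A real segmenter would group connected pixels; this is minimal.)"""
    objs = []
    for y, row in enumerate(pixels):
        for x, color in enumerate(row):
            if color != ".":
                objs.append(SceneObject(len(objs), color, (x, y)))
    return objs

def imagine_move(scene, obj_id, new_pos):
    """Dynamics model: predict the scene after moving one object,
    without touching the physical world."""
    return [replace(o, position=new_pos) if o.obj_id == obj_id else o
            for o in scene]

# Object-level working memory: imagined results stored as symbols, not pixels.
memory: Dict[str, List[SceneObject]] = {}

scene = segment([[".", "g"],
                 ["r", "."]])          # a green and a red "object"
memory["after_move"] = imagine_move(scene, 0, (0, 0))  # imagine moving obj 0
```

The key point the sketch tries to capture is the last line: what the memory holds is a symbolic, object-level prediction, obtained before any physical action is taken.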
Once an abstract concept has been learned, "the robot can recognize and execute it when presented with a new initial scene with a new set of objects," continues Lázaro-Gredilla. It can also identify the concept if it detects another robot executing it. "It is enough for it to make a copy of the initial scene in its imagination, manipulate that scene mentally (instead of in physical reality) and compare the result obtained in its imagination with the one obtained by the other robot," he says.
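The recognition procedure quoted above can be sketched in a few lines: copy the initial scene into "imagination", apply the learned concept there, and compare the imagined result with what the other robot actually produced. The scene encoding and the stand-in concept below are hypothetical illustrations, not the paper's representation.

```python
# Toy sketch of imagination-based concept recognition.
import copy

def arrange_in_circle(scene):
    # Stand-in for a learned concept: tag every object as placed on a circle.
    return sorted((name, "on_circle") for name, _ in scene)

def recognizes_concept(concept, initial_scene, observed_result):
    """True if the concept, run in imagination, reproduces the observation."""
    imagined = concept(copy.deepcopy(initial_scene))  # mental copy, not reality
    return imagined == sorted(observed_result)

initial = [("cube", "scattered"), ("ball", "scattered")]
other_robot_result = [("ball", "on_circle"), ("cube", "on_circle")]
print(recognizes_concept(arrange_in_circle, initial, other_robot_result))  # prints True
```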
To demonstrate the effectiveness of the proposed system, the authors performed different experiments with two robots as a proof of concept. Both machines were equipped with a camera to perceive the scenes in front of them and a mechanical arm to grab and release objects. When performing the tasks, the first succeeded in 90% of the tests, the second in 70%.
The goal of giving robots more autonomy
Currently, explains Lázaro-Gredilla, machines that manipulate objects are usually programmed with an exact sequence of movements to perform. There are also "more sophisticated" robots, which can detect where an object is located and adapt the sequence of movements accordingly, the researcher notes.
In his opinion, his work and that of his colleagues "goes beyond" that. As a demonstration, he cites one of the experiments carried out, in which the robot was asked to separate yellow lemons from green limes placed haphazardly on a table. "To indicate that instruction, regardless of how many fruits there are and their positions, the programming is reduced to showing several pairs of vignettes, with each pair containing an example of mixed limes and lemons and another of separated limes and lemons," he illustrates.
Lázaro-Gredilla believes that this ability can be especially useful in various industrial applications. "Three-dimensional concepts such as 'insert the object into the box' or 'stack all the boxes' are very simple for a human, but from a robotic point of view they are not trivial," he says. Machines capable of handling them can simplify robotic assembly lines and make them reconfigurable, he argues. "The use of abstract concepts reduces the precision requirements, since the concept 'fit piece A into hole B' remains valid even if the objects are in unexpected places, and it also reduces the costs of reconfiguration."
The slow road of artificial intelligence
For Ismael García, of the University of Castilla-La Mancha, the study constitutes a "relevant contribution within cognitive robotics." However, the researcher believes that in this area there is still "a long way to go." In the specific case of Vicarious's work, he notes, "it does not address multimodal learning (from different sources of information: auditory, gestural, or visual), nor learning through reinforcement or imitation, which are clear paradigms of human learning." The expert adds that "the fusion and combination of all of them within the same cognitive architecture model" is not covered either.
For his part, Josep Amat, of the Polytechnic University of Catalonia, says that the article "is framed within the current push in artificial intelligence," which has among its objectives "developing systems for the correct interpretation of all types of data, decision making based on those interpretations, and the development of autonomous systems capable of interacting with the environment." However, he warns, "advances in artificial intelligence are slower than would be desired, slower than those achieved in areas such as microelectronics, communications, or robotics."
Lázaro-Gredilla is aware that these are not results obtained overnight. "If the human brain can do it, we have to be able to replicate the process it follows. But discovering that process with the limited information available about the functioning of the brain requires a trial-and-error exercise that can take a long time. After all, evolution took millions of years," he says.