Google is working on technologies to make video calls more accessible and has developed a new system that detects in real time when a participant is using sign language, with the aim of highlighting that person in group video calls.
Most video calling services highlight the participants who speak aloud in group meetings, which puts people with hearing problems at a disadvantage when they communicate using sign language.
To address this problem, a team of researchers from Google Research has developed a real-time sign language detection model, based on pose estimation, that can identify people as active speakers while they are signing.
The system, presented at the European computer vision conference ECCV’20, uses a lightweight design that keeps the CPU load required to run it low, so as not to degrade call quality.
The tool uses an arm and hand pose estimation model, known as PoseNet, which reduces the image data to a series of landmarks on the user’s eyes, nose, shoulders and hands, among other points, so that movement between frames can also be measured.
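As a rough illustration of this landmark-based approach (a sketch under stated assumptions, not Google's actual code; the specific feature choice here, per-landmark speed normalized by shoulder width, is an assumption), per-frame movement can be estimated from the change in landmark positions between consecutive frames:

```python
import numpy as np

def movement_features(prev_landmarks, curr_landmarks, shoulder_width):
    """Frame-to-frame displacement of pose landmarks (e.g. wrists,
    elbows, nose), normalized by shoulder width so the feature does
    not depend on how close the user sits to the camera."""
    deltas = curr_landmarks - prev_landmarks      # (n_landmarks, 2)
    speeds = np.linalg.norm(deltas, axis=1)       # per-landmark motion
    return speeds / shoulder_width                # scale-invariant

# Example: two frames of 5 landmarks in (x, y) pixel coordinates.
prev = np.array([[100, 50], [120, 80], [140, 80],
                 [90, 200], [150, 210]], dtype=float)
curr = prev + np.array([[0, 0], [2, 1], [-1, 3],
                        [15, -20], [12, -18]], dtype=float)
feats = movement_features(prev, curr, shoulder_width=40.0)
```

Features like these are tiny compared with raw video frames, which is what makes a classifier on top of them cheap enough to run alongside a live call.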
The Google model is about 80% effective at detecting people who are signing while using only about 0.000003 seconds (3 microseconds) of processing time per frame; when the previous 50 frames are also used as context, effectiveness rises to 83.4%.
In addition, the researchers added an extra layer to the model, a long short-term memory (LSTM) architecture, which includes “memory of previous time steps, but no look-back”, and with which it achieves 91.5% effectiveness in just 3.5 milliseconds of processing time per frame.
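To make the recurrent step concrete, here is a minimal NumPy sketch of a single LSTM cell, the building block of that architecture (an illustration of LSTMs in general, not the researchers' model; the sizes and weights are arbitrary assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step: gates decide what to forget from the cell
    state c, what new information to write into it, and what to expose
    as the hidden state h -- the 'memory of previous time steps'."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # all four gates in one product
    f = sigmoid(z[:n])                  # forget gate
    i = sigmoid(z[n:2 * n])             # input gate
    o = sigmoid(z[2 * n:3 * n])         # output gate
    g = np.tanh(z[3 * n:])              # candidate cell update
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
n_in, n_hidden = 8, 4                   # e.g. 8 motion features per frame
W = rng.normal(size=(4 * n_hidden, n_in)) * 0.1
U = rng.normal(size=(4 * n_hidden, n_hidden)) * 0.1
b = np.zeros(4 * n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for frame in rng.normal(size=(50, n_in)):   # 50 frames of features
    h, c = lstm_step(frame, h, c, W, U, b)
```

Because the state `(h, c)` is carried forward frame by frame, the model can accumulate evidence over time without re-reading old frames, which is consistent with the quoted "no look-back" description.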
To improve the accessibility of videoconferencing platforms, the researchers made their tool compatible with them, so it can be used to flag those who use sign language as ‘speakers’.
When the system detects a person signing, it emits ultrasonic sound waves that humans cannot perceive but that the platforms’ speech detection technologies can, so the user is highlighted in the video call.
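The trick can be sketched as synthesizing a sine tone at the upper edge of human hearing (the 20 kHz frequency and 48 kHz sample rate below are illustrative assumptions; human hearing typically tops out near 20 kHz, so such a tone is inaudible to most adults while still being ordinary audio to software):

```python
import numpy as np

SAMPLE_RATE = 48_000   # Hz; Nyquist limit of 24 kHz can represent the tone
TONE_FREQ = 20_000     # Hz; assumed value near the limit of human hearing

def ultrasonic_tone(duration_s, amplitude=0.2):
    """A sine burst that people generally cannot hear, but that a
    conferencing platform's audio-activity detector still registers
    as sound coming from this participant."""
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    return amplitude * np.sin(2 * np.pi * TONE_FREQ * t)

tone = ultrasonic_tone(0.1)   # 100 ms burst while signing is detected
```

Feeding such a burst into the call's audio stream whenever the detector fires lets the approach work with existing platforms without modifying their speaker-selection logic.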
The researchers have published their detection model as open source on GitHub and hope their technology can be “leveraged to allow sign language speakers to use video conferencing more conveniently.”