An American researcher says that, while searching through files stored in the Google cloud, he managed to recover as many as 13 sequences of the Covid-19 virus that had mysteriously disappeared from a database last year.
According to The New York Times, about a year ago, the genetic sequences of more than 200 virus samples from the first cases of Covid-19 in Wuhan (China) disappeared from a scientific database on the internet.
Now, by searching through files stored in the Google cloud, a researcher in Seattle (USA) says he has recovered 13 of those original sequences, adding data that may help discern when and how the virus could have spread from a bat or other animal to humans, says the newspaper.
The new analysis, released Tuesday, reinforces theories that a variety of coronaviruses may have been circulating in Wuhan before the initial outbreaks linked to animal and seafood markets in December 2019.
While the administration of US President Joe Biden investigates the contested origins of the virus, known as SARS-CoV-2, this study neither reinforces nor rules out, for now, the hypothesis that the pathogen leaked from a well-known Wuhan laboratory.
But it raises questions about why the original sequences were removed and suggests that there may be more revelations that can be retrieved from “the most remote corners” of the Internet, the newspaper says.
“This is great detective work without a doubt, and it significantly advances efforts to understand the origin of SARS-CoV-2,” Michael Worobey, an evolutionary biologist at the University of Arizona who was not involved in the study, told The New York Times.
Jesse Bloom, a virologist at the Fred Hutchinson Cancer Research Center who wrote the report, called the deletion of these sequences suspicious.
“It seems likely that the sequences were removed to hide their existence,” he wrote in his report, which has not yet been peer-reviewed or published in a scientific journal, the newspaper notes.
Bloom and Worobey belong to an independent group of scientists who have called for more research into how the pandemic began.
In a letter published in May, they both complained that there was not enough data to determine whether the virus was more likely to have spread from a laboratory or to have jumped to humans through contact with an infected animal outside of such a facility.
While Bloom was reviewing the Covid-19 genetic data that had been published by various research groups, he came across a March 2020 study containing a spreadsheet that included information on 241 genetic sequences collected by scientists at Wuhan University.
According to that spreadsheet, the sequences had been uploaded to an online database called the Sequence Read Archive, managed by the US government’s National Library of Medicine.
But when Bloom searched the database for the Wuhan sequences earlier this month, he no longer “found any items”.
Puzzled, he went back to the spreadsheet in search of more clues and carried out an extensive investigation, the New York newspaper reports, but found no explanation for why the sequences had been uploaded to the Sequence Read Archive and later disappeared.
Nevertheless, the expert has managed to recover 13 of those lost sequences from the cloud and, after combining them with other published data on the earliest coronaviruses, he hopes to make progress in building the SARS-CoV-2 family tree.