A leading American scientist has discovered that early sequences of the coronavirus genome submitted by a Chinese researcher were deleted from a shared database.
Jesse Bloom, a researcher at the Fred Hutchinson Cancer Center in Seattle, has published a paper saying that genetic sequences taken from early coronavirus cases in China were deleted from a US National Institutes of Health database.
He recovered the files from cloud storage and found that viruses from some of the early cases in the Chinese city of Wuhan are genetically different from the variants that eventually spread to cause the pandemic.
Mr Bloom said the data does not shed any further light on whether the virus spread naturally from animals to humans or whether it was the result of a laboratory leak.
“This study does not provide any additional strong evidence favoring either natural zoonosis or lab accident,” Mr Bloom told CNN. “Rather, it shows that there are additional sequences from relatively early in the outbreak that are still unknown, and in some cases have mutations that suggest they are probably evolutionarily older than the viruses from the Huanan Seafood Market.”
In a pre-print paper posted on bioRxiv that has not yet been peer reviewed, Mr Bloom wrote: “I recover the deleted files from the Google Cloud, and reconstruct partial sequences of 13 early epidemic viruses.
“Phylogenetic analysis of these sequences in the context of carefully annotated existing data suggests that the Huanan Seafood Market sequences that are the focus of the joint WHO-China report are not fully representative of the viruses in Wuhan early in the epidemic.
“Instead, the progenitor of known SARS-CoV-2 sequences likely contained three mutations relative to the market viruses that made it more similar to SARS-CoV-2's bat coronavirus relatives.”
He went on to outline the implications of his research in a lengthy Twitter thread.
“First, fact this dataset was deleted should make us skeptical that all other relevant early Wuhan sequences have been shared. We already know many labs in China ordered to destroy early samples,” he wrote.
“Second major implication is that it may be possible to obtain additional information about early spread of #SARSCoV2 in Wuhan even if efforts for more on-the-ground investigations are stymied.”
The sequences were removed in June 2020 at the request of the Chinese investigator who originally submitted them in March of that year, the NIH confirmed to CNN. The agency said it was standard practice to allow submitters to withdraw their data.