Most emerging infectious diseases of humans (for instance, COVID-19) are zoonotic, meaning they are caused by viruses that originate in other animal species. Detecting high-risk viruses sooner can help set research and surveillance priorities.
In a study published in the September 28th issue of PLOS Biology, Nardus Mollentze, Simon Babayan, and Daniel Streicker of the University of Glasgow, United Kingdom, suggest that machine learning, a type of artificial intelligence (AI), applied to viral genomes could predict the probability that any animal-infecting virus will infect humans, given biologically relevant exposure.
Identifying zoonotic diseases before they emerge is a major challenge because only a small minority of the estimated 1.67 million animal viruses are able to infect humans. To develop machine learning models using viral genome sequences, the scientists first compiled a dataset of 861 virus species from 36 families.
They then built machine learning models that assigned each virus a probability of human infection based on patterns in its genome. The team applied the best-performing model to examine patterns in the predicted zoonotic potential of additional virus genomes sampled from a range of species.
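The general idea of scoring and ranking viruses from genome patterns can be illustrated with a minimal Python sketch. This is not the authors' method: the article does not say which genome features or which learning algorithm the team used. The sketch assumes dinucleotide frequencies as a stand-in genome feature and a simple logistic score, and the function names, toy sequences, and weights are all hypothetical placeholders rather than values from the study.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"
# All 16 possible dinucleotides: "AA", "AC", ..., "TT".
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def dinucleotide_frequencies(sequence):
    """Relative frequency of each dinucleotide over overlapping windows.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return {d: 0.0 for d in DINUCLEOTIDES}
    return {d: counts[d] / total for d in DINUCLEOTIDES}


def predicted_probability(features, weights, bias=0.0):
    """Logistic score in (0, 1); an assumed stand-in for a trained model."""
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_by_zoonotic_potential(genomes, weights, bias=0.0):
    """Return (name, probability) pairs, highest predicted probability first."""
    scored = [
        (name, predicted_probability(dinucleotide_frequencies(seq), weights, bias))
        for name, seq in genomes.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy inputs: made-up sequences and an arbitrary weight, for illustration only.
toy_genomes = {"virus_a": "ACGCGCGT", "virus_b": "ATATATAT"}
toy_weights = {"CG": 2.0}

ranking = rank_by_zoonotic_potential(toy_genomes, toy_weights)
for name, probability in ranking:
    print(f"{name}: {probability:.3f}")
```

In a real setting the weights would be learned from viruses with known human-infection status, and the ranked list would only indicate which viruses to examine first.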
The scientists found that viral genomes may have generalizable features that are independent of virus taxonomic relationships and may preadapt viruses to infect humans. They were able to build machine learning models capable of identifying candidate zoonoses using viral genomes.
The models have their limits: computer models are only a preliminary step in identifying zoonotic viruses with the potential to infect humans. Viruses flagged by the models will require confirmatory laboratory testing before major additional research investments are pursued.
Furthermore, while these models predict whether viruses may be able to infect humans, the ability to infect is only one part of broader zoonotic risk, which is also influenced by the virus’ virulence in humans, its ability to transmit between humans, and the ecological conditions at the time of human exposure.
According to the authors, “Our findings show that the zoonotic potential of viruses can be inferred to a surprisingly large extent from their genome sequence. By highlighting viruses with the greatest potential to become zoonotic, genome-based ranking allows further ecological and virological characterisation to be targeted more effectively.”
These findings add a crucial piece to the already surprising amount of information that we can extract from the genetic sequence of viruses using AI techniques. A genomic sequence is typically the first, and often only, information we have on newly-discovered viruses, and the more information we can extract from it, the sooner we might identify the virus’ origins and the zoonotic risk it may pose.
Simon A. Babayan, Researcher, University of Glasgow
“As more viruses are characterized, the more effective our machine learning models will become at identifying the rare viruses that ought to be closely monitored and prioritized for preemptive vaccine development,” he added.
Mollentze, N., et al. (2021) Identifying and prioritizing potential human-infecting viruses from their genome sequences. PLOS Biology. doi.org/10.1371/journal.pbio.3001390.