Protein is often described as the material basis of life. Protein structure, which is both stable and flexible, governs the function of organisms. A protein's spectral response signals can be regarded as a fingerprint of the protein skeleton, and theoretical simulation of those signals can reveal the precise protein structure.
However, protein structure is very complex and dynamic, so simulating it requires a vast number of highly accurate quantum chemistry calculations. As a result, the theoretical interpretation of protein spectra has long remained difficult, limiting both the determination of protein structure and the precise analysis of spectra.
In theoretical spectroscopy, understanding the optical fingerprints of the protein skeleton while avoiding extremely costly quantum chemical computations is a significant scientific problem, one with chemical implications that has been approached using the random forest technique. Combining quantum chemistry with artificial intelligence (AI) offers an efficient tool for predicting the optical properties of proteins.
AI has been widely used in many fields to reduce computational complexity. Recently, Professor JIANG Jun of the Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China (USTC) of the Chinese Academy of Sciences, working with Professor Shaul Mukamel of the University of California, Irvine, and Professor LUO Yi of USTC, used neural networks, a machine learning technique, to establish the relationship between the structure of protein peptide bonds and their properties. The findings were published in PNAS.
The approach reduced the computational cost many times over. It also allowed the researchers to predict the ultraviolet (UV) spectra of peptide bonds efficiently and to identify the structure-property relationships and structural descriptors.
The team first generated 50,000 groups of peptide bond model molecules with varied configurations, using quantum chemistry calculations and molecular dynamics simulation at 300 K.
A machine learning algorithm was used to select bond lengths, bond angles, dihedral angles, and charges as descriptors. A neural network was then trained on this large data set to establish the structure-property relationship between the ground-state structure of the peptide bond and its excited-state properties.
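The geometric descriptors named above (bond length, bond angle, dihedral angle) can all be computed directly from atomic coordinates. The following is a minimal sketch using only the Python standard library. The coordinates, the charge values, and the function names are hypothetical and chosen for illustration; they are not taken from the study, and in practice the charges would come from a quantum chemistry calculation.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bond_length(p, q):
    """Distance between atoms p and q."""
    d = _sub(p, q)
    return math.sqrt(_dot(d, d))

def bond_angle(p, q, r):
    """Angle p-q-r in degrees, measured at the central atom q."""
    u, v = _sub(p, q), _sub(r, q)
    cos_t = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees around the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)  # normals of the two planes
    b1_len = math.sqrt(_dot(b1, b1))
    y = _dot(b1, _cross(n1, n2)) / b1_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates (angstrom): four atoms in a planar trans arrangement.
atoms = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.3, 0.0, 0.0), (1.3, -1.0, 0.0)]
# Placeholder partial charges, one per atom (illustrative values only).
charges = [0.1, 0.5, -0.4, 0.0]

# One descriptor vector: geometry first, then charges.
descriptor = [
    bond_length(atoms[1], atoms[2]),
    bond_angle(atoms[0], atoms[1], atoms[2]),
    dihedral_angle(*atoms),
] + charges
```

Each configuration sampled from a simulation would yield one such vector, and the collection of vectors forms the input to the learning step.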
Using the trained machine learning model, the researchers predicted the ground-state dipole moments and the excited-state properties of the peptide bonds. From these, they then predicted the peptide bonds' UV absorption spectra.
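To illustrate how excited-state properties turn into an absorption spectrum, the sketch below takes excitation energies and transition dipole moments (both in atomic units), converts each to an oscillator strength with the standard relation f = (2/3) ΔE |μ|², and broadens the resulting lines with Gaussians. The two transitions, the broadening width, and the energy grid are made-up values for illustration, and the article does not state which broadening scheme the researchers used.

```python
import math

HARTREE_TO_EV = 27.211386

def oscillator_strength(energy_hartree, transition_dipole_au):
    """f = (2/3) * dE * |mu|^2, with everything in atomic units."""
    mu2 = sum(c * c for c in transition_dipole_au)
    return (2.0 / 3.0) * energy_hartree * mu2

def absorption_spectrum(transitions, grid_ev, sigma_ev=0.2):
    """Sum of area-normalized Gaussians, each weighted by its oscillator strength."""
    norm = 1.0 / (sigma_ev * math.sqrt(2.0 * math.pi))
    spectrum = []
    for e in grid_ev:
        total = 0.0
        for energy_hartree, dipole in transitions:
            center = energy_hartree * HARTREE_TO_EV
            f = oscillator_strength(energy_hartree, dipole)
            total += f * norm * math.exp(-0.5 * ((e - center) / sigma_ev) ** 2)
        spectrum.append(total)
    return spectrum

# Illustrative transitions: (excitation energy in hartree, transition dipole in a.u.).
transitions = [
    (0.20, (0.1, 0.0, 0.0)),  # weak band
    (0.25, (1.0, 0.5, 0.0)),  # strong band
]

step_ev = 0.01
grid_ev = [4.0 + step_ev * i for i in range(501)]  # 4.0 eV to 9.0 eV
spectrum = absorption_spectrum(transitions, grid_ev)

# Energy at which the simulated absorption is strongest.
peak_ev = grid_ev[spectrum.index(max(spectrum))]
```

In a machine learning workflow, the energies and dipoles fed into `absorption_spectrum` would be model predictions rather than values from a quantum chemistry run, and spectra from many sampled configurations would typically be averaged.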
To test the robustness and transferability of the machine learning model, the researchers used the model trained at 300 K to predict the peptide bonds' UV absorption spectra at 200 K and 400 K. The results agreed well with simulations based on time-dependent density-functional theory (TDDFT).
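Agreement between a predicted spectrum and a reference spectrum can be quantified in several ways. The sketch below computes the Pearson correlation coefficient and the root-mean-square error between two spectra sampled on the same energy grid. The intensity values are placeholders, not data from the study, and the article does not say which metric the researchers reported.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Placeholder intensities on a shared energy grid (illustrative values only).
reference_spectrum = [0.0, 0.2, 0.9, 0.4, 0.1]    # stands in for a TDDFT result
predicted_spectrum = [0.0, 0.25, 0.85, 0.45, 0.1]  # stands in for a model prediction

r = pearson(predicted_spectrum, reference_spectrum)
error = rmse(predicted_spectrum, reference_spectrum)
```

A transferability check of the kind described above would repeat this comparison for configurations sampled at each temperature that was not used in training.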
This is the first study to apply AI to the theoretical calculation and prediction of protein spectra. A large volume of data is generated by theoretical computation, AI is trained on that data to establish the relationship between structure and property, and the final model is used for prediction. This offers a new approach to simulating protein spectra.
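The generate, train, predict workflow summarized above can be sketched with a very small neural network. The example below trains a one-hidden-layer network by stochastic gradient descent on synthetic data, then checks it on held-out samples. Everything here is an assumption made for illustration: the two-feature inputs stand in for structural descriptors, `synthetic_property` stands in for a quantum chemistry label, and the architecture and hyperparameters are not those of the study.

```python
import math
import random

def train_mlp(X, y, hidden=8, lr=0.05, epochs=200, seed=0):
    """Train a tanh network with one hidden layer; return a predict function."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return h, sum(w * hj for w, hj in zip(W2, h)) + b2

    for _ in range(epochs):
        for x, target in zip(X, y):
            h, pred = forward(x)
            err = pred - target  # gradient of 0.5 * err**2 with respect to pred
            for j in range(hidden):
                # Backpropagate through tanh before W2[j] is updated.
                grad_h = err * W2[j] * (1.0 - h[j] ** 2)
                W2[j] -= lr * err * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    return lambda x: forward(x)[1]

def synthetic_property(x):
    """Hypothetical smooth target standing in for a computed property."""
    return 0.5 * x[0] - 0.3 * x[1] + 0.2 * x[0] * x[1]

def mse(pred, ref):
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)

# Step 1: generate data (here synthetic; in the study, from theoretical computation).
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [synthetic_property(x) for x in X]
X_train, y_train, X_test, y_test = X[:160], y[:160], X[160:], y[160:]

# Step 2: train the model on the training split.
predict = train_mlp(X_train, y_train)

# Step 3: predict on held-out samples and compare with a mean-value baseline.
mse_model = mse([predict(x) for x in X_test], y_test)
mean_train = sum(y_train) / len(y_train)
mse_baseline = mse([mean_train] * len(y_test), y_test)
```

Comparing `mse_model` with `mse_baseline` gives a quick sense of whether the network has learned anything beyond the average value. A production workflow would more likely use an established machine learning library than a hand-written network.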
This work demonstrates the feasibility and advantages of using machine learning to simulate the UV absorption spectra of the protein peptide bond skeleton. As a result, understanding the optical fingerprints of proteins should become easier and more efficient.
Professor JIANG's group is committed to advancing the use of machine learning in quantum chemistry, with the aim of making it an essential tool for solving quantum chemistry problems.