AI Reveals Enzyme Functions in E. coli Proteins

Despite extensive study of E. coli, the functions of 30% of its proteins remain unknown. To address this, researchers used artificial intelligence (AI) to predict 464 types of enzymes among these uncharacterized proteins. The predicted functions of three of these proteins were then confirmed through in vitro enzyme assays.

The structure of DeepECtransformer’s artificial neural network. Image Credit: Korea Advanced Institute of Science & Technology.

On the 24th, 2023, KAIST (Korea Advanced Institute of Science and Technology) announced a breakthrough by a joint research team, including Gi Bae Kim, Ji Yeon Kim, Dr Jong An Lee, and Distinguished Professor Sang Yup Lee from the Department of Chemical and Biomolecular Engineering at KAIST, along with Dr Charles J. Norsigian and Professor Bernhard O. Palsson from the Department of Bioengineering at UCSD.

The team developed DeepECtransformer, an AI capable of predicting enzyme functions from protein sequences. This AI-based prediction system allows for the rapid and accurate identification of enzyme functions.

Enzymes play a crucial role in catalyzing biological reactions, and understanding the function of each enzyme is vital for comprehending the diverse chemical reactions and metabolic characteristics within living organisms.

The Enzyme Commission (EC) number, a classification system devised by the International Union of Biochemistry and Molecular Biology, categorizes enzymes by the reactions they catalyze and aids in understanding the metabolic traits of various organisms. However, current technologies cannot swiftly identify the enzymes encoded in a genome and assign their corresponding EC numbers.
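For readers unfamiliar with the notation, an EC number has four dot-separated levels: the top-level enzyme class, a subclass, a sub-subclass, and a serial number. The short Python sketch below shows one way to parse such a number. The names `ECNumber` and `parse_ec` are illustrative and do not come from the study, and the sketch assumes the common convention of writing unassigned levels as "-".

```python
from typing import NamedTuple, Optional

# The seven top-level enzyme classes defined by the IUBMB.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


class ECNumber(NamedTuple):
    """Four-level EC number; None marks a level written as '-'."""
    enzyme_class: int
    subclass: Optional[int]
    sub_subclass: Optional[int]
    serial: Optional[int]


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 1.1.1.1' or '3.4.-.-' (hypothetical helper)."""
    cleaned = text.strip()
    if cleaned.upper().startswith("EC"):
        cleaned = cleaned[2:].lstrip(" :")
    parts = cleaned.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {text!r}")
    # A '-' means the level is unassigned; everything else must be an integer.
    levels = [None if part == "-" else int(part) for part in parts]
    if levels[0] not in EC_CLASSES:
        raise ValueError(f"unknown enzyme class in {text!r}")
    return ECNumber(*levels)


if __name__ == "__main__":
    ec = parse_ec("EC 1.1.1.1")  # alcohol dehydrogenase
    print(ec, EC_CLASSES[ec.enzyme_class])
```

A partial number such as "3.4.-.-" parses to a class and subclass only, which is how less specific annotations are usually written.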

While various deep learning-based methodologies have been developed for analyzing biological sequences and predicting protein functions, many suffer from the "black box" problem, where the AI’s inference process cannot be interpreted. Previous AI-driven prediction systems for enzyme functions have also been reported, but they neither address the black box issue nor provide a fine-grained interpretation of the reasoning process.

The joint research team tackled this challenge by creating DeepECtransformer, an AI incorporating deep learning and a protein homology analysis module for predicting enzyme functions.
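The article does not spell out how the deep learning model and the homology module work together. One plausible arrangement, sketched below purely as an illustration, is to accept the neural network's confident predictions and fall back to a homology search only when it returns none. The function `predict_ec`, its two callable arguments, and the 0.5 threshold are all assumptions made for this sketch, not details taken from the study.

```python
from typing import Callable, Dict, Set, Tuple


def predict_ec(
    sequence: str,
    nn_scores: Callable[[str], Dict[str, float]],
    homology_lookup: Callable[[str], Set[str]],
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> Tuple[Set[str], str]:
    """Return predicted EC numbers and the name of the module that produced them.

    nn_scores       -- hypothetical model wrapper: sequence -> {EC number: score}
    homology_lookup -- hypothetical homology search: sequence -> EC numbers of
                       the best-matching annotated protein (may be empty)
    """
    scores = nn_scores(sequence)
    confident = {ec for ec, score in scores.items() if score >= threshold}
    if confident:
        return confident, "neural_network"
    # No confident neural prediction: defer to sequence similarity instead.
    return homology_lookup(sequence), "homology"


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without a trained model.
    fake_model = lambda seq: {"1.1.1.1": 0.92, "2.7.1.1": 0.08}
    fake_search = lambda seq: {"3.1.1.1"}
    print(predict_ec("MKTAYIAKQR", fake_model, fake_search))
```

Returning the source module alongside the prediction makes it easy to see which annotations came from the network and which from similarity to known enzymes.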

The transformer architecture, commonly used in natural language processing, was applied to extract crucial features related to enzyme functions within the entire protein sequence. This approach enabled the accurate prediction of EC numbers, with DeepECtransformer capable of predicting a total of 5360 EC numbers.
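Applying a language-model architecture to proteins means treating each amino acid as a token, much like a word in a sentence. The sketch below shows a minimal version of that encoding step in Python. The vocabulary layout, the padding and unknown-token ids, and the `encode` function are assumptions made for illustration; the article does not describe the tokenization that DeepECtransformer actually uses.

```python
# The 20 standard amino acids, one letter each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

PAD_ID = 0  # fills sequences shorter than max_len (assumed convention)
UNK_ID = 1  # any letter outside the 20 standard amino acids (assumed convention)
TOKEN_IDS = {aa: index + 2 for index, aa in enumerate(AMINO_ACIDS)}


def encode(sequence: str, max_len: int) -> list:
    """Turn a protein sequence into a fixed-length list of integer token ids."""
    # Truncate to max_len, mapping non-standard letters to UNK_ID.
    ids = [TOKEN_IDS.get(aa, UNK_ID) for aa in sequence.upper()[:max_len]]
    # Right-pad so every encoded sequence has exactly max_len entries.
    return ids + [PAD_ID] * (max_len - len(ids))


if __name__ == "__main__":
    print(encode("ACD", 5))
```

A transformer would then map each id to a learned vector and let every position attend to every other, which is what allows features to be drawn from the entire sequence.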

An in-depth analysis of the transformer architecture revealed that, during the inference process, the AI relies on information about catalytic active sites and cofactor binding sites—essential elements for enzyme function.
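One common way to inspect what a transformer focuses on is to look at its attention weights and ask which sequence positions receive the most attention. The sketch below illustrates the idea on a toy attention matrix. The function `top_attended_residues` and the averaging scheme are assumptions for this example, and the researchers' own analysis may have differed.

```python
from typing import List, Tuple


def top_attended_residues(
    attention: List[List[float]], sequence: str, k: int = 3
) -> List[Tuple[int, str, float]]:
    """Rank residues by the average attention they receive.

    attention[i][j] is the weight that position i places on position j.
    Returns (1-based position, amino acid, mean attention received) tuples.
    """
    n = len(sequence)
    if any(len(row) != n for row in attention):
        raise ValueError("attention rows must match the sequence length")
    # Column means: how much attention each residue receives on average.
    received = [sum(row[j] for row in attention) / len(attention) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: received[j], reverse=True)[:k]
    return [(j + 1, sequence[j], received[j]) for j in ranked]


if __name__ == "__main__":
    toy_attention = [
        [0.1, 0.7, 0.2],
        [0.2, 0.6, 0.2],
        [0.1, 0.8, 0.1],
    ]
    for position, residue, weight in top_attended_residues(toy_attention, "MHC", k=2):
        print(position, residue, round(weight, 3))
```

Highly ranked positions could then be compared against annotated active sites or cofactor binding sites to check whether the model attends to functionally meaningful residues.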

By delving into the black box of DeepECtransformer, the researchers confirmed that the AI autonomously identified features crucial for enzyme function during the learning process.

“By utilizing the prediction system we developed, we were able to predict the functions of enzymes that had not yet been identified and verify them experimentally.”

Gi Bae Kim, Study First Author, Korea Advanced Institute of Science & Technology

Kim added, “By using DeepECtransformer to identify previously unknown enzymes in living organisms, we will be able to more accurately analyze various facets involved in the metabolic processes of organisms, such as the enzymes needed to biosynthesize various useful compounds or the enzymes needed to biodegrade plastics.”

“DeepECtransformer, which quickly and accurately predicts enzyme functions, is a key technology in functional genomics, enabling us to analyze the function of entire enzymes at the systems level. We will be able to use it to develop eco-friendly microbial factories based on comprehensive genome-scale metabolic models, potentially minimizing missing information of metabolism.”

Sang Yup Lee, Professor, Korea Advanced Institute of Science & Technology

The collaborative efforts of the research team in developing DeepECtransformer are detailed in their research paper authored by Gi Bae Kim, Professor Sang Yup Lee from the Department of Chemical and Biomolecular Engineering at KAIST, and their colleagues.

The study received financial support from the “Development of next-generation biorefinery platform technologies for leading bio-based chemicals industry project (2022M3J5A1056072)” and the “Development of platform technologies of microbial cell factories for the next-generation biorefineries project (2022M3J5A1056117)” funded by the National Research Foundation and backed by the Korean Ministry of Science and ICT. Distinguished Professor Sang Yup Lee from KAIST led these projects.

Journal Reference:

Kim, G. B., et al. (2023) Functional annotation of enzyme-encoding genes using deep learning with transformer layers. Nature Communications.

