New AI Tool Reveals Hidden Signals That Control Cell Behavior

Researchers have developed a new artificial intelligence tool called APOLLO that helps scientists connect information from genes, proteins, and cell structures, revealing hidden relationships in how cells function and even predicting biological data that was never measured.

Study: Partially shared multi-modal embedding learns holistic representation of cell state. Image Credit: vectorfusionart/Shutterstock.com

The system, described in Nature Computational Science, offers a new way to integrate different types of single-cell biological data. By combining these data sources in a more interpretable framework, APOLLO allows researchers to see which biological signals are shared across measurements and which are unique to specific cellular processes.

Making Sense of Complex Single-Cell Data

Advances in single-cell technologies now allow researchers to measure multiple types of biological information from individual cells at once. These measurements can include gene expression, protein levels, chromatin accessibility, and imaging data. Together, they provide an unusually detailed view of how cells behave.

However, analyzing this information remains difficult. Many existing computational methods treat each data type separately and only compare results afterward. Other approaches combine the data into a single representation, but often do so in ways that are difficult to interpret.

As a result, researchers can struggle to determine which biological signals are shared across different measurements and which reflect unique cellular processes.

APOLLO was developed to address this problem by separating shared information from modality-specific signals during data integration. This structured approach allows researchers not only to combine multiple data sources but also to understand how each contributes to the overall picture of a cell’s state.

How the APOLLO Framework Works

APOLLO integrates multimodal single-cell data using a separate autoencoder, a type of neural network, for each data type. These networks learn compressed representations, called latent spaces, that capture patterns in the data.

The model organizes these representations into three components:

  • One shared latent space containing information common across data types
  • Two modality-specific spaces, one per data type, that capture features unique to each modality
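The layout above can be pictured with a small data structure. The following is only an illustrative sketch, not code from the study: the class name CellLatents, the method decoder_input, and the idea that each decoder reads the shared code concatenated with its own modality-specific code are assumptions. The 50-dimensional shared space matches the SHARE-seq setup reported in the study; which modality receives 20 or 30 dimensions is assumed here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CellLatents:
    """Latent code for one cell: one shared part plus one part per modality."""

    shared: List[float]
    specific: Dict[str, List[float]]

    def decoder_input(self, modality: str) -> List[float]:
        # Assumption: a modality's decoder sees the shared code followed by
        # the code that is private to that modality.
        return self.shared + self.specific[modality]


# Shared size as reported for SHARE-seq; the per-modality assignment of
# 20 and 30 dimensions is a hypothetical choice for this sketch.
SHARED_DIM = 50
SPECIFIC_DIMS = {"rna": 20, "atac": 30}


def empty_latents() -> CellLatents:
    """Create zero-initialized latents with the sizes defined above."""
    return CellLatents(
        shared=[0.0] * SHARED_DIM,
        specific={name: [0.0] * dim for name, dim in SPECIFIC_DIMS.items()},
    )


cell = empty_latents()
rna_in = cell.decoder_input("rna")
atac_in = cell.decoder_input("atac")
print(len(rna_in), len(atac_in))
```

Both decoder inputs begin with the same shared block, which is what lets the shared space carry information common to the two measurements while the remaining dimensions stay private to one of them.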

During training, the system learns these latent spaces through a process called latent optimization. Random noise is added during training to improve generalization, while regularization penalties prevent the values in the latent spaces from growing excessively large.
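In latent optimization, each cell's latent code is treated as a free parameter that is updated by gradient descent together with the decoder, rather than being produced by an encoder. The toy sketch below illustrates that idea along with the noise injection and the penalty on latent size. It is not APOLLO's implementation: the scalar latent, the linear decoder, plain stochastic gradient descent, and every name and hyperparameter value are assumptions chosen to keep the example short.

```python
import random


def recon_error(cells, latents, weights):
    """Sum of squared reconstruction errors for a linear decoder x = w * z."""
    return sum(
        (w * z - x) ** 2
        for obs, z in zip(cells, latents)
        for w, x in zip(weights, obs)
    )


def latent_optimization(cells, epochs=500, lr=0.02, noise_std=0.05, l2=0.01, seed=0):
    """Jointly fit one free scalar latent per cell and a shared linear decoder."""
    rng = random.Random(seed)
    n_features = len(cells[0])
    latents = [0.0] * len(cells)  # free parameters, one per cell
    weights = [rng.gauss(0.0, 0.1) for _ in range(n_features)]  # toy decoder
    for _ in range(epochs):
        for i, obs in enumerate(cells):
            # Random noise is added to the latent before decoding.
            z_noisy = latents[i] + rng.gauss(0.0, noise_std)
            residual = [w * z_noisy - x for w, x in zip(weights, obs)]
            # Gradient of squared error plus an L2 penalty on the latent.
            grad_z = 2.0 * sum(r * w for r, w in zip(residual, weights))
            grad_z += 2.0 * l2 * latents[i]
            grad_w = [2.0 * r * z_noisy for r in residual]
            latents[i] -= lr * grad_z
            weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return latents, weights


# Hypothetical data: four cells whose three features depend on one hidden factor.
true_w = [1.0, -2.0, 0.5]
cells = [[t * w for w in true_w] for t in (-1.0, -0.5, 0.5, 1.0)]

before = recon_error(cells, [0.0] * len(cells), [0.0] * len(true_w))
latents, weights = latent_optimization(cells)
after = recon_error(cells, latents, weights)
print(f"reconstruction error before: {before:.3f}, after: {after:.5f}")
```

The L2 term plays the role of the regularization penalty described above: without it, nothing stops the latents from drifting to large values while the decoder weights shrink to compensate.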

For sequencing-based datasets, the model reconstructs gene expression, chromatin accessibility, and protein abundance from these latent spaces. In SHARE-seq experiments described in the study, the shared space contained 50 dimensions, while modality-specific spaces ranged from 20 to 30 dimensions.

Training the model on 85% of the dataset required roughly 35 hours on a GPU.

For imaging datasets, such as multiplexed tissue imaging and data from the Human Protein Atlas, the architecture incorporates convolutional layers and trainable protein identifiers. In this case, the shared latent space expands to 1024 dimensions to capture complex image features.

Once the latent spaces are learned, the second stage of training teaches encoders to infer these representations directly from input data. This step allows the model to process new samples efficiently and enables cross-modality prediction, where information from one data type can be used to estimate another.
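The second stage and the resulting cross-modality prediction can be sketched in the same toy setting. This is a hedged illustration rather than the authors' method: the shared latents and the target decoder are taken as already learned, both encoder and decoder are linear, and the prediction passes through the shared code only. The function names, the modality labels, and all numbers are hypothetical.

```python
def fit_linear_encoder(inputs, targets, epochs=200, lr=0.05):
    """Stage two: regress fixed, already-learned latents from raw inputs."""
    weights = [0.0] * len(inputs[0])
    for _ in range(epochs):
        for x, z in zip(inputs, targets):
            err = sum(w * xi for w, xi in zip(weights, x)) - z
            weights = [w - lr * 2.0 * err * xi for w, xi in zip(weights, x)]
    return weights


def encode(enc_weights, x):
    """Map one sample of the source modality to a shared latent value."""
    return sum(w * xi for w, xi in zip(enc_weights, x))


def decode(dec_weights, z):
    """Map a shared latent value to features of the target modality."""
    return [w * z for w in dec_weights]


# Assumed outputs of stage one: shared latents for four training cells and a
# linear decoder for the target modality (labelled "protein" here).
shared_latents = [-1.0, -0.5, 0.5, 1.0]
protein_decoder = [0.5, 3.0]

# Hypothetical source-modality measurements (labelled "chromatin") for the same cells.
chromatin = [[2.0 * z, -1.0 * z] for z in shared_latents]

encoder = fit_linear_encoder(chromatin, shared_latents)

# Cross-modality prediction for a new cell measured only in the source modality.
new_chromatin = [0.5, -0.25]
predicted_protein = decode(protein_decoder, encode(encoder, new_chromatin))
print(predicted_protein)
```

Because the encoder is trained to reproduce latents that were fixed in stage one, new samples can be mapped into the shared space in a single pass, without repeating the per-cell optimization.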

Testing the Model on Real Biological Data

The researchers tested APOLLO across several datasets to evaluate its ability to separate shared and modality-specific information.

In simulated datasets with known underlying structure, the model correctly identified relationships between latent variables across several scenarios of increasing complexity.

When applied to SHARE-seq data, in which chromatin accessibility (ATAC) and RNA expression are measured simultaneously, the modality-specific spaces captured meaningful biological signals. The RNA-specific space was enriched for genes involved in the cell cycle, while the ATAC-specific space highlighted transcriptional regulators.

On CITE-seq datasets, which combine RNA sequencing with protein measurements, APOLLO was able to separate biological signals from experimental batch effects. The shared space grouped cells by type rather than batch, while the RNA-specific space captured batch variation. Existing methods, such as Seurat’s weighted nearest neighbor approach, were unable to separate these signals as clearly.

The model was also tested on multiplexed imaging data, where it predicted protein staining patterns using chromatin information alone. The predicted protein images performed nearly as well as real images in phenotype classification tasks.

Further analysis revealed how different latent spaces captured distinct cellular features. For example, heterochromatin volume appeared in the shared space, while the number of γH2AX DNA-damage foci was specific to the protein modality.

Linking Cell Structure to Protein Localization

In experiments using Human Protein Atlas data, APOLLO identified connections between cellular morphology and protein localization.

Different proteins showed distinct relationships with structural features of the cell. The localization of DNA damage-binding protein 1 (DDB1), for example, correlated with endoplasmic reticulum morphology, while the protein CLNS1A was associated primarily with nuclear features.

These results suggest that different cellular compartments can independently relate to where proteins appear within the cell.

Toward a More Complete View of Cell State

Overall, APOLLO provides a framework for integrating multimodal single-cell data while keeping the results interpretable. By separating shared and modality-specific signals, the system can predict missing biological measurements and reveal how different cellular features contribute to observed phenotypes.

Although the authors note that the latent optimization approach still lacks formal theoretical guarantees, the framework represents a step toward more transparent AI tools for biological research.

As single-cell technologies continue to expand, methods like APOLLO may help researchers build a more complete picture of how genes, proteins, and cellular structures interact to shape cell behavior.

Journal Reference

Zhang, X., Shivashankar, G.V. & Uhler, C. Partially shared multi-modal embedding learns holistic representation of cell state. Nat Comput Sci (2026). DOI:10.1038/s43588-025-00948-w. https://www.nature.com/articles/s43588-025-00948-w

Citation

Nandi, Soham. (2026, March 13). New AI Tool Reveals Hidden Signals That Control Cell Behavior. AZoRobotics. Retrieved on March 13, 2026 from https://www.azorobotics.com/News.aspx?newsID=16356.
