This novel technology, termed CCADDAS (Clustering/Classification of All events, Dimensionality reduction, Downsampling, and Aberrancy Scaling), automates and compresses flow cytometry (FC) data files to cut manual analysis time by 90% without compromising diagnostic accuracy.
A Disease That Lingers Beneath the Surface
In the western hemisphere, chronic lymphocytic leukemia (CLL) is the most common form of leukemia; in 2025 alone, approximately 24,000 new cases were diagnosed in the United States.
While many patients achieve complete remission following therapy, residual cancer cells can persist at trace levels undetectable by standard clinical assessment. This lingering presence, known as measurable residual disease (MRD), is an established independent predictor of disease progression and overall survival, making its accurate detection critical to treatment decision-making.
FC is the most widely used method for MRD testing due to its accessibility, speed, and cost-effectiveness. However, conventional FC analysis is a manual process. Each sample file can contain over one million cellular events, requiring expert-level gating across multiple two-dimensional plots to isolate leukemic populations from normal background cells.
This process averages around nine minutes per case and requires advanced software and computational infrastructure. As a result, FC-based MRD testing remains largely confined to large academic centers and specialist reference laboratories, leaving most clinical settings unable to offer it reliably.
The core challenge the researchers set out to address is therefore not merely technical but operational: how to make high-sensitivity MRD testing faster, simpler, and more accessible without sacrificing the diagnostic rigor it demands.
How CCADDAS Transforms Raw Data
The CCADDAS pipeline was deployed on Google Vertex AI and trained using just 29 negative control samples (15 bone marrow aspirates and 14 peripheral blood specimens), each manually annotated by expert analysts across 13 defined cell types.
This remarkably small training requirement is one of the pipeline's key practical advantages, as it allows adaptation to new panels and platforms with minimal overhead.
When a raw FC file enters the pipeline, it undergoes a sequential series of automated transformations. First, FlowCut removes acquisition errors. Then, phenotyping by accelerated refined community-partitioning (PARC) clusters all cellular events, followed by uniform manifold approximation and projection (UMAP) for dimensionality reduction.
A deep neural network (DNN) classifier, trained on the expert-annotated controls, then automatically assigns each event to one of 13 defined cell classes, with a macro-average F1 score of 0.93.
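For readers unfamiliar with the metric, a macro-average F1 score is the unweighted mean of the per-class F1 scores, so rare cell classes count as much as abundant ones. The following is a minimal sketch of that calculation, not the authors' evaluation code; the class names and toy labels are made up for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all classes seen."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy labels (three classes instead of the pipeline's 13)
y_true = ["B", "B", "B", "T", "T", "CLL"]
y_pred = ["B", "B", "T", "T", "T", "CLL"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because each class contributes equally to the average, a high macro-average F1 suggests the classifier performs well on small populations as well as large ones.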
The pipeline's most distinctive feature is its cluster-informed downsampling strategy. Rather than uniformly reducing data, it caps each cluster at 5,000 events while leaving smaller clusters completely intact. This preserves the critical minority of leukemic cells while stripping out the vast redundancy of normal background events.
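The capping idea can be illustrated with a short sketch. It assumes each event already carries a cluster label and that events in oversized clusters are chosen by simple random sampling; the article does not say how the pipeline selects which events to keep, and the function name, cluster names, and counts below are hypothetical.

```python
import random
from collections import defaultdict


def cluster_capped_downsample(cluster_labels, cap=5000, seed=0):
    """Return sorted indices of events kept after capping each cluster at `cap`."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, label in enumerate(cluster_labels):
        by_cluster[label].append(idx)

    kept = []
    for indices in by_cluster.values():
        if len(indices) > cap:
            # Assumption: random subsampling within an oversized cluster
            indices = rng.sample(indices, cap)
        # Clusters at or below the cap are kept in full
        kept.extend(indices)
    return sorted(kept)


# Hypothetical toy file: two large background clusters and one tiny cluster
labels = ["normal_b"] * 20000 + ["t_cell"] * 12000 + ["cll_mrd"] * 40
kept = cluster_capped_downsample(labels, cap=5000)
kept_counts = {name: sum(1 for i in kept if labels[i] == name) for name in set(labels)}
print(len(kept), kept_counts)
```

In this toy case the two large clusters shrink to the cap while all 40 events in the small cluster survive, which is the property that keeps rare populations visible after compression.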
The result is a file reduced from an average of 1.1 million events to approximately 165,000, and from 62.1 to 15.2 MB in size, reported reductions of 85% and 78%, respectively. These compact, AI-annotated files are compatible with any standard FC software, removing the need for specialized platforms.
A computed "aberrancy scale" parameter further simplifies MRD identification by scoring each cell cluster based on how far it deviates from the negative-control baseline. This single composite metric outperformed every individual antigen marker in distinguishing MRD events from benign B cells, achieving an area under the curve (AUC) of 0.98.
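An AUC can be read as the probability that a randomly chosen MRD event receives a higher score than a randomly chosen benign B cell. The sketch below computes it by direct pairwise comparison. It illustrates the metric only, not the aberrancy scale itself, whose formula the article does not give, and the scores are invented.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical scores: higher means more aberrant
mrd_scores = [0.9, 0.8, 0.75, 0.4]
benign_scores = [0.1, 0.2, 0.3, 0.5, 0.35]
auc = roc_auc(mrd_scores, benign_scores)
print(auc)
```

An AUC of 1.0 would mean every MRD event outscores every benign B cell, and 0.5 would mean the score is no better than chance.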
Performance Across Two Cohorts
The pipeline was validated across a research cohort of 227 samples and a subsequent validation cohort of 192 samples. Across both cohorts combined, CCADDAS achieved 100% positive agreement with conventional analysis for MRD at or above the recommended clinical sensitivity threshold of 0.01%, and 100% negative agreement across 162 MRD-negative samples. For the more demanding sub-threshold cases below 0.01%, positive agreement reached 84%.
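Positive and negative agreement compare the new method's calls against the conventional analysis, which is treated as the reference. The following minimal sketch shows the arithmetic on made-up calls; the function name and data are illustrative and do not reproduce the study's results.

```python
def percent_agreement(reference, test):
    """Return (positive, negative) percent agreement of `test` against `reference`."""
    test_on_ref_pos = [t for r, t in zip(reference, test) if r]
    test_on_ref_neg = [t for r, t in zip(reference, test) if not r]
    # Positive agreement: share of reference-positive samples also called positive
    ppa = 100.0 * sum(1 for t in test_on_ref_pos if t) / len(test_on_ref_pos)
    # Negative agreement: share of reference-negative samples also called negative
    npa = 100.0 * sum(1 for t in test_on_ref_neg if not t) / len(test_on_ref_neg)
    return ppa, npa


# Hypothetical calls: True = MRD detected
reference_calls = [True] * 5 + [False] * 5
pipeline_calls = [True, True, True, True, False] + [False] * 5
ppa, npa = percent_agreement(reference_calls, pipeline_calls)
print(ppa, npa)
```

The sketch assumes at least one reference-positive and one reference-negative sample; an empty group would need separate handling.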
Quantitative concordance was excellent, with R² values of 0.98 and 0.9975 in the research and validation cohorts, respectively. Crucially, the pipeline's aberrancy scale successfully flagged atypical CLL immunophenotypes without requiring those variants to be represented in the training data.
Manual analysis time fell from a mean of nine minutes using conventional gating to just 0.9 minutes using AI-enhanced files, a 90% reduction confirmed by two independent frontline laboratory technologists after only two hours of training.
The AI-enhanced workflow accelerated the identification of suspicious populations, but the flow cytometry plots still required review and interpretation for the final MRD assessment.
Toward Broader Access in Leukemia Care
CCADDAS represents a meaningful advance in making high-sensitivity CLL MRD testing clinically viable beyond specialist centers. By producing compact, software-agnostic files enriched with AI-derived parameters, the pipeline dramatically lowers the expertise barrier and time burden of FC-based MRD analysis.
The authors note current limitations, including a pipeline runtime of 15 to 20 minutes per case, which prevents immediate post-acquisition analysis, and the need for external validation across other laboratory platforms.
Future work aims to make CCADDAS available as a cloud-based service that other laboratories can train, validate, and deploy independently for MRD testing across a range of hematologic malignancies.
Journal Reference
Chiu et al. (2026). Artificial intelligence-enhancement of flow cytometry data accelerates the identification of measurable residual chronic lymphocytic leukemia. Leukemia. DOI:10.1038/s41375-026-02986-3, https://www.nature.com/articles/s41375-026-02986-3