Atrial fibrillation (AF) affects over 58 million people globally and substantially elevates ischemic stroke risk. While direct oral anticoagulants have simplified management, existing risk-stratification tools, such as CHA2DS2-VASc, show limited predictive accuracy and overlook complex clinical interactions.
Prior machine learning approaches often lack external validation, require extensive laboratory inputs unavailable at diagnosis, or target already-treated populations, limiting real-world utility. This paper addresses these gaps by developing and externally validating interpretable machine learning models that use only age, comorbidities, and medication data to accurately predict one-year stroke risk at the time of AF diagnosis.
Study Design and Model Development
This study developed and externally validated interpretable machine learning models to predict one-year ischemic stroke risk in patients with newly diagnosed AF, using only age, comorbidities, and medication history to maximize clinical applicability.
Data were drawn from the National Taiwan University Hospital Integrated Medical Database, comprising a derivation cohort of 9,511 patients from the Taipei tertiary center and two external validation cohorts totaling 2,542 patients from regional branches in Hsin-Chu and Yun-Lin, allowing both geographic and temporal validation.
Predictors were restricted to variables routinely available at diagnosis to avoid selection bias from missing laboratory data. Logistic regression and Platt-calibrated XGBoost models were trained, with feature selection guided by Bayesian Information Criterion minimization and gain-based importance ranking, respectively.
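The Platt calibration step mentioned above can be illustrated with a minimal sketch. Platt scaling fits a sigmoid, p = 1 / (1 + exp(-(a·s + b))), that maps a model's raw scores s to probabilities. The scores, labels, learning rate, and function names below are illustrative assumptions and do not come from the study, which applied the calibration to XGBoost outputs.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical raw classifier scores and observed outcomes (1 = stroke).
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
calibrated = [sigmoid(a * s + b) for s in scores]
print("slope a = %.3f, intercept b = %.3f" % (a, b))
```

At the fitted optimum, the mean calibrated probability matches the observed event rate in the calibration data, which is the property that makes the outputs usable as risk estimates rather than as rankings only.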
Performance was assessed using area under the receiver operating characteristic curve, area under the precision-recall curve, Brier scores, and calibration curves, with the CHA2DS2-VASc score serving as the benchmark comparator. Decision curve analysis quantified net clinical benefit across threshold probabilities, while net reclassification improvement evaluated risk stratification gains.
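Two of the metrics named above are simple enough to compute by hand. The sketch below, using made-up outcomes and predicted risks rather than study data, computes the area under the receiver operating characteristic curve as the probability that a randomly chosen patient with a stroke is scored higher than a randomly chosen patient without one, and the Brier score as the mean squared difference between predicted risk and outcome.

```python
def auroc(y_true, y_prob):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared error of the predicted probabilities (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical outcomes (1 = stroke within one year) and predicted risks.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.80, 0.90]

auc_value = auroc(y_true, y_prob)
brier_value = brier_score(y_true, y_prob)
print("AUROC = %.3f, Brier = %.3f" % (auc_value, brier_value))
```

The two metrics answer different questions: the first measures how well patients are ranked, while the second also penalizes probabilities that are systematically too high or too low, which is why studies of this kind typically report both.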
Sensitivity analyses excluded stroke events occurring within three days of diagnosis to mitigate reverse causality concerns. Sex-stratified subgroup evaluations and analyses stratified by prior stroke history were conducted to ensure equitable model performance.
Long-term predictive utility was examined using Kaplan–Meier cumulative incidence curves at three- and five-year follow-up, with patients stratified into high- and low-risk groups based on model predictions and further subdivided by direct oral anticoagulant use. Finally, to demonstrate clinical translation, a web-based interactive decision support interface was prototyped, enabling visualization of individualized risk estimates to facilitate shared decision-making at the point of care.
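For readers unfamiliar with the Kaplan–Meier approach, the following sketch shows how cumulative incidence at three and five years can be estimated when some patients are censored before having an event. The follow-up times and event flags are invented for illustration, and the function names are assumptions, not part of the study's code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times:  follow-up in years
    events: 1 = stroke observed, 0 = censored (lost to or end of follow-up)
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # strokes occurring exactly at time t
        n_leaving = 0  # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve

def cumulative_incidence(curve, horizon):
    """1 - S(t), using the last event time at or before the horizon."""
    surv = 1.0
    for t, s in curve:
        if t <= horizon:
            surv = s
    return 1.0 - surv

# Hypothetical follow-up for eight patients in one risk group.
times = [0.5, 1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 5.0]
events = [1, 0, 1, 1, 0, 1, 0, 0]

curve = kaplan_meier(times, events)
ci_3y = cumulative_incidence(curve, 3.0)
ci_5y = cumulative_incidence(curve, 5.0)
print("3-year incidence = %.2f, 5-year incidence = %.2f" % (ci_3y, ci_5y))
```

Running the same estimate separately for each subgroup, for example high-risk patients with and without anticoagulant use, gives the curves that are then compared.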
Model Validation and Comparative Performance
Models were trained on the 9,511-patient derivation cohort from the Taipei tertiary center and tested on the two external validation cohorts from regional branches (2,542 patients in total), enabling geographic and temporal validation. Laboratory data were excluded because patients with and without available results differed significantly in demographics, so requiring those results would have introduced selection bias.
Logistic regression selected nine predictors via Bayesian Information Criterion, while XGBoost selected eleven predictors based on gain-based importance. Internal validation yielded area under the receiver operating characteristic curve values of 0.915 and 0.914, respectively, substantially outperforming the CHA2DS2-VASc score (0.67).
External validation confirmed robust generalizability, with area under the receiver operating characteristic curve values ranging from 0.877 to 0.886, a modest drop from the internal values that suggests limited overfitting. Sex-stratified analyses demonstrated equitable performance across subgroups. Decision curve analysis revealed higher net clinical benefit for both models than for the CHA2DS2-VASc score across threshold probabilities of 0.1 to 0.7, corresponding to approximately 102 additional high-risk patients correctly identified per 1,000 individuals.
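Decision curve analysis rests on one quantity, net benefit, which at a threshold probability p_t equals TP/n − (FP/n) × p_t / (1 − p_t): true positives are credited, and false positives are penalized by the odds implied by the threshold. The sketch below applies that standard formula to invented data; the outcomes, predicted risks, and function names are assumptions for illustration only.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical outcomes (1 = stroke) and model-predicted risks for ten patients.
y_true = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.55, 0.20, 0.10, 0.30, 0.05, 0.15, 0.40, 0.25]

nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
print("model: %.2f, treat all: %.2f" % (nb_model, nb_all))
```

A difference in net benefit between two strategies, multiplied by 1,000, is commonly read as the net number of additional true positives per 1,000 patients at that threshold. Whether the per-1,000 figure quoted above was derived in exactly this way is not stated here.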
Long-term follow-up over three and five years showed that logistic regression-defined high-risk groups derived significant stroke reduction from anticoagulation, whereas CHA2DS2-VASc-defined groups displayed inconsistent, paradoxical patterns. These findings support the models' superior risk stratification and their potential to guide treatment.
Clinical Implications and Future Directions
Both logistic regression and Platt-calibrated XGBoost models demonstrated excellent discrimination and maintained strong generalizability across geographically and temporally distinct external cohorts, significantly outperforming the CHA2DS2-VASc score.
Sex-stratified analyses confirmed equitable performance without algorithmic bias. Decision curve analysis showed consistently higher net benefit, identifying over 100 additional high-risk patients per 1,000 without increasing false positives. Long-term follow-up revealed that logistic regression-defined high-risk groups accurately identified patients benefiting from anticoagulation, whereas CHA2DS2-VASc produced paradoxical associations.
Unlike prior models requiring extensive biomarkers or imaging, these interpretable models use readily available clinical features, supporting practical implementation. A prototype web-based interface was developed to facilitate clinician engagement and prospective validation. Future validation in non-Asian populations and stroke-naïve subgroups is warranted.
Practical ML Advantage
This study demonstrates that clinically interpretable machine learning models using only age, comorbidities, and medication history significantly outperform the CHA2DS2-VASc score for predicting one-year stroke risk in newly diagnosed AF patients. Both logistic regression and Platt-calibrated XGBoost models exhibited excellent discrimination, robust external validation, and superior clinical utility without sex-based bias.
Long-term analysis confirmed improved risk stratification and anticoagulation guidance. By relying solely on readily available clinical data, these models offer a practical, scalable tool for personalized stroke prevention at the point of AF diagnosis. Future prospective validation across diverse populations is warranted to support broader implementation.
Journal Reference
Lin, J. C.-W., Chang, C.-M., Pan, H.-Y., Ho, Y.-L., Tu, Y.-K., & Lai, C.-L. (2026). Interpretable machine learning models for stroke risk prediction in patients with newly diagnosed atrial fibrillation. npj Digital Medicine, 9(1). DOI: 10.1038/s41746-026-02470-3. https://www.nature.com/articles/s41746-026-02470-3