Instead of learning from murmurs alone, the model was trained using echocardiogram-confirmed labels. Developed and tested on recordings from 1767 patients, the neural network achieved an area under the receiver operating characteristic curve (AUROC) of 0.83, exceeded the performance of general practitioners (GPs), and detected severe aortic stenosis and severe mitral regurgitation with 98 % and 94 % sensitivity, respectively, offering a promising path toward low-cost, high-scale screening.
Background
Valvular heart disease (VHD), often described as the “next cardiac epidemic”, is an increasing concern in ageing populations and a leading contributor to heart failure. More than half of cases remain undiagnosed, largely because early symptoms mimic routine ageing or reduced fitness. The resulting delay in referral often translates to poorer surgical outcomes.
Although echocardiography is the diagnostic benchmark, it is too resource-intensive for widespread screening. Primary care instead relies on GP auscultation, yet sensitivity remains modest. Earlier AI tools focused on murmur detection alone, limiting accuracy because they were trained on small datasets without direct echocardiographic confirmation.
To bridge these gaps, the current study introduced a neural network trained directly on echocardiography-labelled data from a large, diverse cohort. By learning subtle acoustic markers that extend beyond human-audible murmurs, the algorithm was designed to support earlier and more reliable detection of VHD.
Study Design and Methods
Researchers developed an AI system for detecting VHD using electronic stethoscope recordings paired with echocardiographic reference diagnoses. Data was gathered from 1767 patients across three United Kingdom (UK) studies, including two hospital cohorts undergoing routine echocardiography and the community-based OxVALVE study. Exclusion criteria included previous valve surgery, pregnancy, and severe heart failure.
Heart sounds were recorded for up to 15 seconds at four standard auscultation sites (aortic, pulmonary, tricuspid, and mitral) using electronic stethoscopes in real clinical environments, complete with background noise. All participants underwent formal echocardiography, with valve severity graded per British Society of Echocardiography guidance.
The algorithm builds on a PhysioNet challenge-winning foundation and employs a recurrent neural network with transfer learning. It was initially pre-trained on open-access murmur recordings and subsequently fine-tuned using the echocardiography-labelled dataset. Separate models were created for each auscultation site, and the highest predicted probability across sites determined the final classification.
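The final aggregation step described above can be sketched in a few lines of Python. This is only an illustration of taking the maximum probability across sites: the function name, the example probabilities, and the 0.5 threshold are assumptions, since the article does not report the operating threshold or any implementation details.

```python
from typing import Dict, Tuple

# Hypothetical decision threshold; the study's actual value is not given here.
THRESHOLD = 0.5


def classify_patient(
    site_probs: Dict[str, float], threshold: float = THRESHOLD
) -> Tuple[bool, str, float]:
    """Combine per-site model outputs by taking the highest probability.

    Returns (flagged_as_vhd, deciding_site, probability).
    """
    if not site_probs:
        raise ValueError("at least one site probability is required")
    # The site whose model is most confident decides the patient-level result.
    site = max(site_probs, key=site_probs.get)
    prob = site_probs[site]
    return prob >= threshold, site, prob


# Made-up probabilities for one patient, one value per auscultation site.
example = {"aortic": 0.31, "pulmonary": 0.12, "tricuspid": 0.64, "mitral": 0.40}
print(classify_patient(example))
```

Taking the maximum means a single confident site is enough to flag a patient, which favours sensitivity at some cost in specificity.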
For comparison, GPs reviewed the same recordings through an online platform (with only the patient’s sex revealed) and predicted clinically significant VHD. Statistical assessment included McNemar’s test and bootstrap methods.
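For readers unfamiliar with McNemar’s test, the sketch below shows the exact (binomial) form for paired yes/no outcomes, such as whether the algorithm and a GP each classified the same recording correctly. The article does not say which variant of the test the authors used, and the counts in the example are invented for illustration.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases where only the first rater was correct
    c: cases where only the second rater was correct
    Concordant pairs carry no information about the difference and are ignored.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, not taken from the study.
print(mcnemar_exact(b=30, c=12))
```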
Algorithm Performance and Comparative Evaluation
The final dataset comprised 6479 recordings, amounting to over 25 hours of audio from 1767 patients (48 % female, median age 74). Of these, 1504 patients were used for training and 263 for the independent test set. Clinically significant VHD (≥mild stenosis or ≥moderate regurgitation) was present in 793 patients (45 %), with aortic stenosis (AS) and mitral regurgitation (MR) being the most common lesions.
The AI system achieved an AUROC of 0.83 for detecting clinically significant VHD. At the predefined operating threshold, sensitivity was 72 % and specificity 82 %, with strong calibration (expected calibration error 0.08).
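Expected calibration error summarises how far predicted probabilities sit from observed disease frequencies. The sketch below uses the common equal-width binning definition; the number of bins and the toy data are assumptions, as the article does not describe how the authors computed the figure.

```python
from typing import Sequence


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted probability and observed rate."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - observed)
    return ece


# Toy data: predictions of 0.1 and 0.9 that are each off by 0.1 on average.
toy_probs = [0.1, 0.1, 0.9, 0.9]
toy_labels = [0, 0, 1, 1]
print(round(expected_calibration_error(toy_probs, toy_labels), 3))
```

A value of 0 would mean predicted probabilities match observed frequencies in every bin.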
Performance for severe disease stood out: 98 % sensitivity for severe AS and 94 % for severe MR. Sensitivity for moderate AS was also high at 89 %, while moderate MR detection was more modest (75 % in testing, 50 % in cross-validation). Site-level analysis showed the tricuspid position contributed most to overall performance, though combining aortic and mitral sites offered the widest benefit.
Against 14 GPs listening to the same recordings, the algorithm showed clear advantages in both sensitivity and specificity (72 % vs 62 % and 82 % vs 64 %, respectively). It outperformed 13 of 14 GPs on the Youden Index, and importantly, displayed more consistent performance than the wide variation seen among clinicians.
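The Youden Index is sensitivity plus specificity minus one, so it can be worked out directly from the figures quoted above. The short sketch below does this for the algorithm and for the pooled GP figures; note that the study’s 13-of-14 comparison was made per GP, and those individual values are not given in this article.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J: 0 means no better than chance, 1 means a perfect test."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the article.
algorithm_j = youden_index(0.72, 0.82)
gp_j = youden_index(0.62, 0.64)

print(f"Algorithm J = {algorithm_j:.2f}")  # 0.54
print(f"Pooled GP J = {gp_j:.2f}")         # 0.26
```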
Insights and Discussion
This work demonstrated that machine learning can accurately detect clinically significant VHD using short, non-invasive stethoscope recordings, supported by one of the largest phonocardiogram datasets with echocardiographic confirmation to date. Training directly against echo labels allowed the model to identify acoustic cues beyond human perception, contributing to its strong performance, especially in severe AS and MR.
The comparison with GPs, though limited by headphone-based remote listening, still highlighted how AI can supplement traditional auscultation in settings where diagnostic resources are limited.
Because the dataset intentionally included a higher proportion of hospital patients to ensure adequate numbers of clinically significant cases, real-world prevalence would be lower, which may affect predictive value and sensitivity. Larger primary care studies will be important to confirm generalisability in routine practice.
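The effect of prevalence on predictive value follows from Bayes’ rule and can be illustrated with the reported sensitivity and specificity. In the sketch below the 45 % prevalence is the study’s own figure, while the 10 % value is a purely hypothetical primary care prevalence chosen for illustration. It also assumes sensitivity and specificity stay fixed, which, as noted above, may not hold in a different population.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Probability that a positive result is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


SENS, SPEC = 0.72, 0.82  # as reported at the predefined threshold

ppv_study = positive_predictive_value(SENS, SPEC, 0.45)    # study prevalence
ppv_primary = positive_predictive_value(SENS, SPEC, 0.10)  # hypothetical

print(f"PPV at 45 % prevalence: {ppv_study:.2f}")
print(f"PPV at 10 % prevalence: {ppv_primary:.2f}")
```

Under these assumptions, roughly three in four positive results are true positives at the study prevalence, falling to under one in three at the hypothetical 10 % prevalence.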
Conclusion
This study presents an AI system capable of detecting clinically significant VHD from electronic stethoscope recordings with near-clinical precision. By training on echocardiographic labels rather than murmur annotations, the model identifies acoustic features beyond human hearing, achieving 98 % sensitivity for severe aortic stenosis and 94 % for severe mitral regurgitation, and outperforming GP auscultation.
While detection of some regurgitant lesions, such as aortic regurgitation (AR) and tricuspid regurgitation (TR), was lower, the system provides a rapid, low-cost, scalable screening option that could support earlier diagnosis and more timely referral, particularly in overstretched primary care environments.
Further real-world validation will be essential, but this approach offers a practical bridge between limited echocardiography capacity and the low sensitivity of traditional auscultation.
Journal Reference
McDonald et al. (2026). Development and validation of AI-Enhanced auscultation for valvular heart disease screening through a multi-centre study. npj Cardiovascular Health, 3(1), 5-. DOI:10.1038/s44325-026-00103-y. https://www.nature.com/articles/s44325-026-00103-y