A recent review indicates that, even now, humans remain more accurate than technology at detecting possible cases of breast cancer during screening.
The findings were recently published in the journal BMJ.
According to the researchers, there is a lack of good quality evidence to back the policy of substituting human radiologists with artificial intelligence (AI) technology while screening for breast cancer.
Globally, breast cancer is a major cause of death among women, and several countries have introduced mammography screening programs to detect and treat it at an early stage. However, reading mammograms for early signs of cancer is high-volume, repetitive work for radiologists, and some cancers are missed.
Earlier studies have suggested that AI systems may surpass humans and could eventually replace experienced radiologists. Still, a recent review of 23 studies highlighted evidence gaps and concerns about the methods used.
To deal with this uncertainty, the UK National Screening Committee appointed a research group from the University of Warwick to analyze the accuracy of AI for breast cancer detection in mammography screening practice.
The scientists reviewed 12 studies conducted since 2010, comprising data on 131,822 screened women in the United States, Sweden, Spain, the Netherlands, and Germany. Across the 12 studies, the quality of the methods used was found to be poor, and their relevance to European or UK breast cancer screening programs was low.
Three large studies involving 79,910 women compared AI systems with the clinical decisions of the original radiologist. Among these women, 1,878 had screen-detected cancer or interval cancer (cancer diagnosed between routine screening appointments) within one year of screening.
The majority of AI systems assessed in these three studies, 34 out of 36 (94%), were less accurate than a single radiologist, and all were less accurate than the consensus of two or more radiologists, which is standard practice in Europe.
In contrast, five smaller studies comprising 1,086 women reported that all of the AI systems assessed were more accurate than a single radiologist. However, the scientists noted that these studies were at greater risk of bias and that their promising results have not been replicated in larger studies.
In three studies, AI was employed as a pre-screen to triage which mammograms need to be reviewed by a radiologist and which do not. This approach screened out 53%, 45%, and 50% of women at low risk, but also 10%, 4%, and 0% of the cancers detected by radiologists.
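The triage trade-off reported above can be laid out numerically. The short sketch below (the function name and structure are illustrative, not from the study; only the percentages come from the review) pairs each study's low-risk screen-out rate with the share of radiologist-detected cancers the pre-screen would also have excluded:

```python
def triage_tradeoff(pct_screened_out, pct_cancers_missed):
    """Pair each study's screen-out rate with the share of
    radiologist-detected cancers the AI pre-screen also excluded."""
    return list(zip(pct_screened_out, pct_cancers_missed))

# Percentages reported for the three pre-screening studies in the review
screened_out = [53, 45, 50]   # % of low-risk women removed from radiologist review
cancers_missed = [10, 4, 0]   # % of radiologist-detected cancers also removed

for out, missed in triage_tradeoff(screened_out, cancers_missed):
    print(f"Screened out {out}% of women, missing {missed}% of detected cancers")
```

The pairing makes the trade-off explicit: only one of the three configurations reduced radiologist workload without excluding any cancers.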
The researchers acknowledge certain study limitations, such as the exclusion of non-English studies that might have contained relevant evidence, and they add that AI algorithms are short-lived and constantly evolving, so reported evaluations of AI systems may be outdated by the time of publication.
However, the use of strict study inclusion criteria, along with a rigorous and systematic assessment of study quality, suggests the conclusions are robust.
As such, they say: “Current evidence on the use of AI systems in breast cancer screening is a long way from having the quality and quantity required for its implementation into clinical practice.”
They add: “Well designed comparative test accuracy studies, randomized controlled trials, and cohort studies in large screening populations are needed which evaluate commercially available AI systems in combination with radiologists in clinical practice.”
Freeman, K., et al. (2021). Use of artificial intelligence for image analysis in breast cancer screening programs: systematic review of test accuracy. BMJ. https://doi.org/10.1136/bmj.n1872