How Machine Learning Is Transforming Disease Diagnosis

One of the biggest challenges in providing patients with high-quality health care is making the right diagnosis at the right time. Because time is precious for patients, a tool that detects patterns human doctors might overlook could be a game changer.
 
The health care industry generates copious amounts of data, covering everything from patient demographics and treatments to insurance records and clinical reports. "Big data" like this holds great value, and when used ethically, machine learning (ML) techniques applied to these datasets can revolutionize how we diagnose patients by detecting disease earlier and guiding the choice of treatment plan. Working with big data calls for strong computing resources and ML approaches that can identify significant patterns and relationships within the data, both of which are necessary for disease prediction and clinical action.
 
With a timely diagnosis supported by ML pattern recognition, a patient can receive early and effective intervention and may have a higher chance of recovery. This is life changing! In recent years, these kinds of tools have been tested on real health data. Samin Poudel, a data scientist at Citibank, trained and tested 20 ML algorithms on a diabetes dataset to evaluate the accuracy of the algorithms' diagnoses [1]. Some algorithms are more effective than others for particular diseases and datasets, so the ability to identify the best-suited algorithm is valuable. The study found that the most advanced ML method diagnosed diabetes with 77% accuracy, a promising result that shows the model can learn what a diabetes diagnosis looks like.
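
To make the idea of comparing algorithms concrete, here is a minimal sketch in Python. It is not the code or data from the study [1]. The synthetic "glucose" readings, the three simple classifiers, and all function names are assumptions made up for illustration. The sketch trains each model on one part of the data, measures accuracy on held-out records, and reports the best scorer.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def make_toy_data(n, seed=0):
    """Synthetic (glucose, label) pairs. Illustrative only, NOT patient data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.35 else 0
        mean = 150 if label else 105  # assumed group means, chosen arbitrarily
        rows.append((rng.gauss(mean, 25), label))
    return rows


class MajorityClass:
    """Baseline: always predict the most common training label."""

    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label] * len(X)


class GlucoseThreshold:
    """Predict 1 when the reading is at or above a cutoff learned from training data."""

    def fit(self, X, y):
        self.cutoff = max(
            sorted(set(X)),
            key=lambda c: accuracy(y, [int(x >= c) for x in X]),
        )
        return self

    def predict(self, X):
        return [int(x >= self.cutoff) for x in X]


class NearestNeighbor:
    """Predict the label of the closest training reading."""

    def fit(self, X, y):
        self.train = list(zip(X, y))
        return self

    def predict(self, X):
        return [min(self.train, key=lambda pair: abs(pair[0] - x))[1] for x in X]


def compare_models(models, train, test):
    """Fit every model on train and return its accuracy on test."""
    X_train, y_train = [r[0] for r in train], [r[1] for r in train]
    X_test, y_test = [r[0] for r in test], [r[1] for r in test]
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy(y_test, model.predict(X_test))
    return scores


data = make_toy_data(400)
train, test = data[:300], data[300:]  # hold out 100 records for evaluation
scores = compare_models(
    {
        "majority class": MajorityClass(),
        "glucose threshold": GlucoseThreshold(),
        "nearest neighbor": NearestNeighbor(),
    },
    train,
    test,
)
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("best on held-out data:", best)
```

Scoring on records the models never saw during training is what makes the comparison meaningful. A model that simply memorizes its training data can look perfect there and still fail on new patients.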
 
Medical images are another important form of health data, so it is crucial that ML algorithms be able to analyze this kind of input to detect cancer, neurological disorders, and infectious diseases. Scott Mayer McKinney, a technical staff member at OpenAI, and his colleagues analyzed mammographic images in the hope of identifying early indicators of tumor malignancy [2]. Their system's performance surpassed that of human experts in breast cancer prediction, an exciting finding that could allow cancer patients to begin treatment before their tumors progress significantly!
 
Before these ML techniques can be employed in clinical settings, their accuracy must be much higher, and more testing must be performed. However, these studies, among many others, lay a promising foundation for future steps toward diagnosing patients in real time.

When working with biological data, the quality of that data must be taken into consideration. Not all records are kept with the highest precision or organization, and ML models trained on poor-quality data will yield poor-quality predictions. For diagnoses to generalize, algorithms must be trained on data that is balanced to represent diverse patient populations, which helps prevent bias and supports accurate diagnostic decisions for all patients, not just those who resemble the patients in the training data. Health data also contains highly sensitive information about patients, so keeping this data secure and analyzing it ethically is essential.
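
One simple way to check whether a model serves diverse patient populations equally well is to break its accuracy down by group instead of reporting a single overall number. The sketch below assumes hypothetical records of the form (group, true label, predicted label). The group names and values are invented for illustration and do not come from either study cited here.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group from (group, true_label, predicted_label) tuples."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / totals[group] for group in totals}


def accuracy_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical predictions for two patient groups (made-up values).
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 1), ("group_b", 0, 1),
]

per_group = accuracy_by_group(records)
gap = accuracy_gap(per_group)
print(per_group)  # {'group_a': 1.0, 'group_b': 0.5}
print(gap)        # 0.5
```

In this toy example the overall accuracy is 75%, which hides the fact that the model is right every time for one group and only half the time for the other. A large gap like this is a signal to revisit how well each group is represented in the training data.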
 
ML will not replace clinicians, but it gives them powerful new tools to detect diseases earlier, treat patients sooner, and ultimately improve outcomes.


Author

Emily Garman

MSDS Residential Student Ambassador

[1] Poudel, S. (2022). A Study of Disease Diagnosis Using Machine Learning. Medical Sciences Forum, 10(1), 8. https://doi.org/10.3390/IECH2022-12311

[2] McKinney, S. M., Sieniek, M., Godbole, V., et al. (2020). International evaluation of an AI system for breast cancer screening. Nature, 577, 89–94. https://doi.org/10.1038/s41586-019-1799-6