Machine learning in healthcare

From SI410

Machine learning in healthcare is the use of machine learning techniques, algorithms, and models to improve the quality and efficiency of healthcare services. Machine learning can greatly reduce the time and effort needed for simple healthcare tasks that can be automated, such as collecting and analyzing data. Many technical challenges that machine learning models face, such as generalizing well to unseen data, are made harder by the nature of healthcare. The use of machine learning in healthcare also raises ethical dilemmas involving the confidentiality of patient records, the collection of private data, the inherent bias of machine learning models, and the interpretability of those models.

What is Machine Learning?

Machine learning is a field of computer science in which computational machines perform tasks without being given explicit instructions on how to perform them. These machines are represented as models or algorithms described by certain attributes: a performance metric by which the model is judged, a knowledge base that includes everything the model knows or has seen before, and a set of tasks that the model performs. In machine learning, a machine is said to learn if its performance with respect to this metric has improved.[1] In this sense, there is some notion of which models are better than others at solving particular problems. The primary goal of any machine learning algorithm or model is to analyze an initial set of data, grow its knowledge base according to this analysis, and make accurate and relevant predictions about unseen data in the future. There are many different types of learning, each designed to target different types of problems.

Problems that tend to be ideal candidates for machine learning are regression or classification problems, in which the goal is to identify trends and patterns in data and extend these observations to unseen data. Some examples include:

  • Using a camera-based agent to distinguish apples from oranges
  • Predicting the trajectory of a company's sales given the sales patterns over the past few years
  • Predicting whether someone has a certain disease based on the symptoms they are exhibiting or their family history
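The idea that a model "learns" when its performance metric improves, and that the result should carry over to unseen data, can be illustrated with a minimal sketch. The data, the linear model, the learning rate, and the function names below are all assumptions chosen for illustration; they do not come from any source cited in this article.

```python
# Minimal sketch: a linear model y = w*x + b trained by gradient descent.
# All data here is hypothetical and generated from y = 2x + 1.

def mse(w, b, xs, ys):
    """Performance metric: mean squared error of the model on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, steps=2000, lr=0.05):
    """Fit w and b by repeatedly stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Data the model is allowed to see during training.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]

# Unseen data, used only to judge the model.
unseen_x = [5.0, 6.0]
unseen_y = [11.0, 13.0]

error_before = mse(0.0, 0.0, unseen_x, unseen_y)  # untrained model
w, b = train(train_x, train_y)
error_after = mse(w, b, unseen_x, unseen_y)       # trained model

print(f"error on unseen data before training: {error_before:.4f}")
print(f"error on unseen data after training:  {error_after:.4f}")
```

In this sketch the metric is measured on data the model never trained on, which is the sense of "learning" that matters for the healthcare applications discussed below.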

Applications of Machine Learning in Healthcare

With recent technological advances and new findings in computer science, machine learning is becoming particularly relevant in healthcare. Technology excels at identifying patterns in the information it takes in, which is helpful in applications such as identifying the presence of certain diseases, creating documentation on patient records, providing medical information through chat-bots, and even performing surgeries.[2] One area that would benefit greatly from automation is the collection of patient data and the generation of electronic health records (EHRs). An EHR is a "digital version of a patient's paper chart. EHRs are real-time, patient centered records that make information available instantly and securely to authorized users." EHRs help medical personnel keep track of patients' data over time, know which patients need to come in for checkups, and easily monitor the overall health of patients.[3] Machine learning techniques and models could be especially useful for detecting diseases and health patterns in patients by analyzing EHRs. In one case, a machine learning algorithm was used to develop a model that took in around 1.7 million EHRs and predicted the future suicidal behavior of patients. When tested on unseen data, the model achieved 35%-49% sensitivity (true positive classification rate) and 90%-95% specificity (true negative classification rate).[4]
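Sensitivity and specificity, the two measures reported above, can be computed directly from a model's predictions. The following is a minimal sketch; the function name and the labels are hypothetical and are not taken from the cited study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: share of actual positives that the model flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual negatives that the model cleared.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical ground truth and predictions for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A model can have high specificity and still miss many true cases, which is why both numbers are reported together.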

Technical Challenges of Using Machine Learning in Healthcare

Robustness of Models

Since healthcare services are not standardized across the world, machine learning models must be transferable across different locations. This requires training models on data that is representative of the general public and that captures a wide variety of features and patterns. Machine learning models need to be reliable and robust in order to be effective and applicable in different scenarios.[5]

Accessibility of Data

Machine learning models need a rich source of data to train on so that they are accurate when deployed on unseen data. Access to this data may be limited if much of it is withheld due to privacy and confidentiality agreements.

Ethical Issues

Use of Personal Data

Using machine learning in healthcare raises concerns about the privacy of the patients whose data is used in these models. Companies and professionals developing machine learning for healthcare need open access to adequate and relevant data. Because this data is obtained from patients themselves, it needs to be maintained appropriately. However, certain data brokers working with machine learning in healthcare do not have the same legal obligations that healthcare providers do. They do not have to ask patients for permission to use and share their private information.[6]

Lack of Transparency

One concept relevant to machine learning models is model interpretability. Model interpretability is the extent to which a user of the model is able to understand how the model works and what exactly it is doing. A model with little or no interpretability is called a black box: the user sees only the input and the output, not what happens in between. In general, high model interpretability is favored over low model interpretability, so that there is no uncertainty about how the model interacts with the input data and how it determines its output. One case in which low model interpretability was problematic in healthcare arose in the 1990s.[7] There was a push to use algorithms to decide whether pneumonia patients should be kept at the hospital. However, the models being trained predicted that patients who had asthma as well as pneumonia were healthy and should not be admitted. The models were not performing as intended; instead, they had found a correlation in the data, namely that pneumonia patients with asthma are treated more urgently and thus have better outcomes. This highlights the problem of using machine learning algorithms as black boxes, since some patients could have been handled incorrectly. Doctors who use these models and algorithms should therefore have a working knowledge of how the models work and what they are doing, so that they are fully aware of how the models may affect patients and their well-being.[8]
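To make the contrast with a black box concrete, the sketch below shows a transparent linear risk model whose per-feature contributions can be inspected. Everything in it is an assumption made for illustration: the feature names, the weights, and the `explain` helper are hypothetical and do not reproduce the 1990s pneumonia models.

```python
import math

# Hypothetical weights of a transparent linear risk model (not from any real study).
# A negative weight means the feature lowers the predicted risk.
WEIGHTS = {"age_over_65": 0.9, "low_blood_oxygen": 1.4, "has_asthma": -0.8}
BIAS = -2.0

def explain(patient):
    """Return the predicted risk and each feature's contribution to the log-odds."""
    contributions = {name: WEIGHTS[name] * patient.get(name, 0) for name in WEIGHTS}
    log_odds = BIAS + sum(contributions.values())
    risk = 1 / (1 + math.exp(-log_odds))  # logistic function maps log-odds to (0, 1)
    return risk, contributions

patient = {"age_over_65": 1, "low_blood_oxygen": 0, "has_asthma": 1}
risk, contributions = explain(patient)

print(f"predicted risk: {risk:.3f}")
for name, value in contributions.items():
    print(f"  {name}: {value:+.2f}")
```

Because the contribution of `has_asthma` is visible and negative, a clinician reading this output could question whether asthma truly lowers risk or whether the model has absorbed a pattern like the one described above. A black-box model would give only the final risk.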

Bias of Models

Machine learning models have no concept of right and wrong. They only perform the tasks that they were designed to do. Bias refers to a machine learning model picking up on patterns in the data that its designers did not intend. Bias may come from a model that is not complex enough to learn well, or from a data set that is not fully representative of a population. In healthcare, this is problematic because people's well-being is being placed in the hands of technology that is very sensitive to its inputs, and humans may not be able to completely control or predict the behavior of these technologies. Studies have shown racial disparities in cancer outcomes between African Americans and whites: African Americans received less treatment than whites and had a higher mortality rate.[9] This type of data is problematic for machine learning models because the disparity is built into the data itself, which could lead models trained on it to widen racial disparities.
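One simple way to check a model for this kind of bias is to compare its performance across groups rather than only in aggregate. The sketch below computes the true positive rate per group; the group labels, records, and function name are hypothetical and are not drawn from the cited study.

```python
from collections import defaultdict

def true_positive_rate_by_group(records):
    """Map each group to the share of its actual positive cases the model detected.

    Each record is a (group, actual, predicted) tuple with 1 = positive, 0 = negative.
    Groups with no actual positive cases are omitted from the result.
    """
    positives = defaultdict(int)
    detected = defaultdict(int)
    for group, actual, predicted in records:
        if actual == 1:
            positives[group] += 1
            if predicted == 1:
                detected[group] += 1
    return {group: detected[group] / positives[group] for group in positives}

# Hypothetical model outputs for two patient groups, "A" and "B".
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 1, 0), ("B", 1, 0), ("B", 0, 0),
]

rates = true_positive_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: true positive rate = {rate:.2f}")
```

In this made-up example the model detects far fewer true cases in group B than in group A, a gap that an overall accuracy figure would hide.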

Patient Confidentiality

Because doctors need to know how to use machine learning models in healthcare, the healthcare system may need to include professionals specifically designated to work with these models. This would change the dynamics of patient-doctor relationships as well. The professionals handling the machine learning models and algorithms would need to know sensitive details about patients in order to fully understand how their models are working.[10]


  1. Machine Learning. Retrieved March 13, 2020.
  2. Polachowska, Kaja. 5 medical challenges that can be solved with AI in healthcare, 28 Feb 2019. Retrieved March 13, 2020.
  3. What is an electronic health record (EHR)? Retrieved March 27, 2020.
  4. Adkins, Daniel E. Machine learning and electronic health records: A paradigm shift, 1 Feb 2017. Retrieved March 27, 2020.
  5. Ghassemi, M., Naumann, T., Schulam, P., Beam, A., Chen, I., & Rajesh Ranganath. "A Review of Challenges and Opportunities in Machine Learning for Health", 5 Dec 2019. Retrieved March 27, 2020.
  6. Hoffman, Sharona. Artificial intelligence in medicine raises legal and ethical concerns, 4 Sep 2019. Retrieved March 13, 2020.
  7. Mbadiwe, Tafari. The Potential Pitfalls of Machine Learning Algorithms in Medicine, 12 Dec 2017. Retrieved March 13, 2020.
  8. Vayena, E., Blasimme, A., & I. Glenn Cohen. Machine learning in medicine: Addressing ethical challenges, 6 Nov 2018. Retrieved March 13, 2020.
  9. Aizer, A., Wilhite, T., Chen, M., Graham, P., Choueiri, T., Hoffman, K., Martin, N., Trinh, Q., Hu, J., & Paul L. Nguyen. Lack of reduction in racial disparities in cancer-specific mortality over a 20-year period, 22 Feb 2014. Retrieved March 13, 2020.
  10. Hannon, Patricia. Research says use of artificial intelligence in medicine raises ethical questions, 14 Mar 2018. Retrieved March 13, 2020.