Big Data analytics for personalized medicine

From SI410
The "one size fits all" approach to Precision Medicine depicted by Dr. Jody Barbeau [1].

Big Data in Personalized Medicine (Precision Medicine) is the area of biomedical research that uses an individual’s medical, genetic, behavioral, and environmental information to help personalize their healthcare [2]. Before the notion of personalized medicine, the "one size fits all" approach was dominant in the healthcare field [3]. Big data in health and biomedical research provides large-scale, multi-dimensional collection, analysis, and interpretation of information used for preventive, diagnostic, and therapeutic applications [4]. This medical model allows precise and customized healthcare for patients by connecting large amounts of information from these diverse datasets and revealing correlations among them. While the concept is well developed, the computational complexity of health-related big data raises various ethical issues relating to the anonymity of information, demographic implications, and genetic discrimination [3].


The idea of using big data analytics in medicine has been around for a while. One of the first times that biological differences between patients were recognized was the identification of the ABO blood group system by Karl Landsteiner in 1901. In 2003, the sequencing of the human genome was completed, which facilitated more research on personalized medicine. On December 18th, 2015, President Barack Obama signed the Precision Medicine Initiative, which drew attention to the idea of delivering the right treatments to the right person at the right time [5]. This initiative shares data generated over more than ten years from medical records, personal data, and various digital health technologies. Its aim is to develop a stronger understanding of diseases and their pathogenesis [6].

The growth of massive data collections is reflected in many initiatives, such as the Global Alliance for Genomics and Health, ELIXIR, Big Data to Knowledge, the International Cancer Genome Consortium (ICGC), the International Human Epigenome Consortium (IHEC), the International Rare Diseases Research Consortium (IRDiRC), and many more [2].

Personalized Medicine Stakeholders

The Precision Medicine ecosystem developed by the Roundtable on Translating Genomic-Based Research for Health [7].

Various stakeholders play a role within the precision medicine ecosystem: patients, providers, clinical laboratories, and researchers. With a patient’s consent, researchers can generate large amounts of data about family history and environmental exposures. Clinicians can then use the knowledge produced by clinical laboratories, and the growing body of data is disseminated to other systems. The precision medicine ecosystem also extends to governments, which sponsor research and products; industries, which act as partners in the development and commercialization of products; and professional societies, which educate and train future researchers, providers, and policy analysts. The stakeholders all work together in shaping precision medicine data and how it is used [6].

Big Data Applications

Timeline of Precision Medicine applications across an individual's life from the 2016 World Innovation Summit for Health [6]

Big data are applied to various areas within healthcare, such as drug and biomarker development and research in cancer, rare diseases, neurodegeneration, diabetes, cardiovascular pathologies, and many more fields [2].

The applications of big data analytics in personalized medicine can be grouped into three categories [3].

  • Basic Research - discovering molecular targets for new therapies and the discovery of biomarkers
  • Clinical Research - clinical testing for targeted therapies, diagnostic techniques, and predictive tests
  • Clinical Practice - diagnosing individuals and developing preventive care with more precision of disease onset

Personalized medicine is used at various points in a person’s life. For example, genetic screening before the conception of a child can predict the risk of passing on a disease to the offspring. During pregnancy, a mother can receive genetic testing to check for chromosomal abnormalities in the fetus and can even obtain whole genome sequencing. At birth, genetic sequencing can be used to diagnose critical conditions, which can then be assessed and treated immediately. Throughout an individual’s life, big data analysis is used for the diagnosis and treatment of various diseases [6].
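As a rough illustration of how preconception screening can translate into a risk estimate, the sketch below enumerates the equally likely offspring genotypes for a single gene. It is a deliberate simplification, not a clinical method: it assumes a single-gene, autosomal recessive condition with complete penetrance, and the allele labels "A" and "a" and both function names are hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return the probability of each offspring genotype for one gene.

    Each parent is a two-character string such as "Aa". By assumption,
    "A" is the typical allele and "a" is the disease-associated allele.
    """
    counts = {}
    # Each parent passes on one of its two alleles with equal probability.
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))  # "aA" and "Aa" are the same
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def risk_affected_recessive(parent1, parent2, disease_allele="a"):
    """Probability that a child inherits two copies of the disease allele."""
    probabilities = offspring_genotypes(parent1, parent2)
    return probabilities.get(disease_allele * 2, Fraction(0))


# Two carrier parents: one in four chance of an affected child.
print(risk_affected_recessive("Aa", "Aa"))  # 1/4
# One carrier and one non-carrier: no affected children under this model.
print(risk_affected_recessive("Aa", "AA"))  # 0
```

Real screening draws on many genes, variable penetrance, and population data, so a calculation like this is only the simplest case of the risk prediction described above.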

Data Types

Data within healthcare increase in complexity and magnitude daily. Neuroimaging produces more than ten petabytes of data every year, and its data have grown ninefold within the last three decades. Genomic data are projected to reach the quintillion scale in the next decade. Imaging data is considered one of the largest data types in healthcare. It includes gigapixel images that display organisms and tissues at subcellular resolution, along with quantitative measurements described by metadata. Another key data type is electronic health records, which hold patient information and relay it between clinicians [2].

Genome sequencing, transcriptome sequencing, proteome profiling, and interactome profiling are also big data technologies that contribute to precision medicine and to healthcare in general. Single-cell genome and transcriptome sequencing, identification of circulating tumor DNA (ctDNA) through liquid biopsy, and sequencing of bacterial genomes contribute as well [2].

Ethical Dilemmas

The use of big data within healthcare opens up many challenges and dilemmas.

  • Exclusion of Communities - There are demographic gaps within health data sets and, in turn, excluded groups are less likely to receive proper personalized medicine, or any at all [3]. This is because less information and evidence is available for developing personalized treatments for these neglected groups.
  • Anonymization - It is difficult to develop data sets with full anonymity, especially within genome analysis, since every genome is unique to its individual [3]. Privacy and confidentiality within healthcare research therefore continue to raise ethical concerns.
  • Demographic Implications - Big data analytics produce information that can be misinterpreted and can reinforce ethnic and racial stereotypes.
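
The anonymization difficulty above can be made concrete with a small sketch. It is not taken from the cited sources: it assumes that a few ordinary fields (here ZIP code, birth year, and sex) act as quasi-identifiers, and it counts how many records share each combination, the idea behind k-anonymity. The records, field names, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records with names already removed.
records = [
    {"zip": "48104", "birth_year": 1980, "sex": "F", "variant": "BRCA1"},
    {"zip": "48104", "birth_year": 1980, "sex": "F", "variant": "none"},
    {"zip": "48105", "birth_year": 1975, "sex": "M", "variant": "none"},
    {"zip": "48109", "birth_year": 1992, "sex": "F", "variant": "CFTR"},
]

# Assumed quasi-identifiers: fields that could be linked to outside data.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def k_anonymity(rows, keys=QUASI_IDENTIFIERS):
    """Return the size of the smallest group; k = 1 means someone is unique."""
    return min(group_sizes(rows, keys).values())


def unique_rows(rows, keys=QUASI_IDENTIFIERS):
    """Return the rows whose quasi-identifier combination appears only once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


print(k_anonymity(records))       # 1
print(len(unique_rows(records)))  # 2
```

Even in this tiny table, two of the four records are unique on the assumed quasi-identifiers. Genomic data make the problem harder still, because the genome itself can act as an identifier.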

The ethical challenges within big data bring to light various issues within personalized medicine. Privacy, personal autonomy, heterogeneity, proper data protection and storage, and the public demand for trust, fairness, and transparency in the use of big data in healthcare are all dilemmas worth noting [4].

Six Ethical Values to Precision Medicine and Big Data

To address the ethical issues surrounding precision medicine and big data analytics, six values are highlighted.

  • 1. Harm minimization - These initiatives work to limit the harms from genetic discrimination and group reputational effects through efforts such as genetic anti-discrimination measures, data access limitations, and encryption systems [4].
  • 2. Justice - Lack of access because of high cost, lack of equitable and inclusive prioritization, and social and power disparities are all forms of injustice seen in precision medicine [4].
  • 3. Public benefit - Precision medicine can improve patient care by developing targeted therapies, preventive screening tools, and personalized pharmaceutical prescriptions. Big data in precision medicine benefits the public by accelerating researchers' ability to generate insights [4].
  • 4. Transparency - Since precision medicine is not entirely anonymous, trust is earned only if governance and protection systems operate, manage, and distribute individuals’ data with transparency [4].
  • 5. Engagement - With precision medicine initiatives, engaging with the public is important to provide legitimacy. It also supports the responsible development of precision medicine through a socially acceptable lens [4].
  • 6. Reflexivity - Initiatives must be aware of their ethical implications and of the fact that these implications will shift over time. A reflexive approach recognizes that ethical issues within precision medicine can change depending on the context. Being able to constantly adapt and revisit decisions is important within this ever-changing field [4].


  1. Barbeau, Jody. PDX and Personalized Medicine, 5 June 2019.
  2. Cirillo, Davide, and Alfonso Valencia. Big Data Analytics for Personalized Medicine, 6 Apr. 2019.
  3. Sun, Shirley, et al. Precision Medicine and Big Data, 1 Jan. 2019.
  4. Ienca, Marcello, et al. Considerations for Ethics Review of Big Data Health Research: A Scoping Review, 11 Oct. 2018.
  5. Harold on History: The Evolution of Personalized Medicine.
  6. Ginsburg, Geoffrey S, and Kathryn A Phillips. Precision Medicine: From Science To Value, May 2018.
  7. Roundtable on Translating Genomic-Based Research for Health, et al. Genomics-Enabled Learning Health Care Systems: Gathering and Using Genomic Information to Improve Patient Care and Research. National Academies Press (US), 8 July 2015.