How data science improves health care


From improving clinical care to reducing health disparities, data science helps solve complex health care problems.

The health care industry generates a massive amount of data. Electronic health records (EHRs), genome sequencing, mobile health devices, social media, and other sources of health records together make up health care’s big data. Additionally, advances in image analysis and language processing make it possible to process this data quickly, which leads to better clinical care.

Drawing meaningful insights from this data requires data management, informatics, statistics, machine learning, and other applications of data science in health care. The findings can lead to new insights and a deeper understanding of human diseases. Health care data science therefore has the potential to improve public health and clinical outcomes, treat complex diseases, and speed up the development of lifesaving drugs.

The following areas highlight how data science is leading to innovations in the health care industry.


Computer Vision

Computer vision is an application of artificial intelligence that trains computers to interpret images much as a human would. It acts as a second pair of eyes to improve the accuracy of clinical decisions. Meharry’s Biomedical Imaging Analysis Lab combines computer vision with state-of-the-art machine learning and deep learning tools to help with research and clinical decision support.

Computer vision has made numerous positive contributions to health care. It helps physicians measure blood loss during childbirth more accurately, predict heart rhythm disorders, and focus on patient care rather than time-consuming tasks.

Natural language processing in health

Natural language processing (NLP) enables computers to understand human speech and text. The same technology that detects spam and powers virtual personal assistants helps draw insights from unstructured patient data.

Natural language processing, combined with machine learning, enables physicians to gain actionable insight from health care data. Vast amounts of text drawn from clinical notes can be distilled into a more focused view by creating phenotypes for patient groups. For instance, NLP can extract data from pathology reports to help explore the relationships between cancerous tissues and genetic mutations.
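To make the pathology-report example more concrete, here is a minimal sketch of rule-based extraction in Python. It is only an illustration: the gene list, the sample report, and the function name `extract_mutations` are all assumptions made up for this example, and a real clinical NLP pipeline would also need to handle negation ("not detected"), abbreviations, and far more varied wording.

```python
import re

# Hypothetical gene list, chosen only for illustration.
GENES = ("EGFR", "KRAS", "BRAF", "TP53")

# Matches a gene symbol followed, within the same sentence, by a
# protein-level change such as "L858R" or "p.R175H".
MUTATION_RE = re.compile(
    r"\b(?P<gene>" + "|".join(GENES) + r")\b"
    r"[^.]*?\b(?:p\.)?(?P<change>[A-Z]\d+[A-Z])\b"
)


def extract_mutations(report: str) -> list[tuple[str, str]]:
    """Return (gene, change) pairs mentioned in a free-text report."""
    return [(m.group("gene"), m.group("change"))
            for m in MUTATION_RE.finditer(report)]


# Made-up report text, not real patient data.
report = (
    "Lung adenocarcinoma, moderately differentiated. "
    "Molecular testing: EGFR mutation detected, L858R. "
    "TP53 p.R175H also identified."
)

print(extract_mutations(report))
# [('EGFR', 'L858R'), ('TP53', 'R175H')]
```

Once mentions like these are in a structured form, they can be joined with tissue findings from the same reports to study how mutations and cancer types relate.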

Health care organizations can also leverage speech recognition with natural language processing. Providers can dictate reports directly into a patient’s electronic health record, saving transcription expenses and making the reports available in real time.

Electronic phenotyping and machine learning

Electronic phenotyping uses data to characterize an individual’s health status. It is an important step in the application of data science to health care. The Norwegian Centre for E-health Research lists its valuable contributions as:

  • identifying people with specific conditions
  • public health and safety surveillance
  • administrative purposes
  • clinical research studies
  • precision medicine (PatientsLikeMe)
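The first item in the list, identifying people with specific conditions, can be sketched as a simple rule applied to EHR records. The example below is a hedged illustration rather than a validated clinical definition: the `PatientRecord` structure, the sample patients, and the rule itself (an ICD-10 code beginning with "E11", or at least two HbA1c results of 6.5% or higher) are assumptions chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, heavily simplified view of one patient's EHR data."""
    patient_id: str
    diagnosis_codes: list[str] = field(default_factory=list)
    hba1c_results: list[float] = field(default_factory=list)  # percent


def has_t2d_phenotype(rec: PatientRecord) -> bool:
    """Illustrative rule for a type 2 diabetes phenotype.

    Assumed rule: a diagnosis code starting with "E11", or two or more
    HbA1c results at or above 6.5%.
    """
    has_code = any(code.startswith("E11") for code in rec.diagnosis_codes)
    high_labs = sum(1 for value in rec.hba1c_results if value >= 6.5)
    return has_code or high_labs >= 2


# Made-up records, not real patient data.
records = [
    PatientRecord("p1", ["E11.9", "I10"], [7.2]),
    PatientRecord("p2", ["I10"], [5.4, 5.6]),
    PatientRecord("p3", [], [6.8, 7.1]),
]

cohort = [r.patient_id for r in records if has_t2d_phenotype(r)]
print(cohort)  # ['p1', 'p3']
```

Real phenotyping algorithms typically combine many more signals, such as medications, clinical notes, and timing between events, and are checked against manual chart review.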

With electronic phenotyping, vast electronic health record data becomes a powerful aid to physicians. This information can support clinical decisions or determine risk for a specific medical condition. For example, in Meharry’s Population Health and Disparity Research Lab, Dr. Aize Cao, associate professor of biomedical data science, is integrating EHR data and informatics to improve patient health outcomes.

Illustration of analysis at the Population Health Disparity Research Lab.

New drug target discovery

Data science has joined biology, chemistry and medicine as a critical component of pharmaceutical research. Nic Fleming writes that machine learning and other methodologies should make drug discovery “quicker, cheaper and more effective.”

Berg, a biotechnology company, uses artificial intelligence to identify potential treatments from biological causes of disease.

“We are turning the drug-discovery paradigm upside down by using patient-driven biology and data to derive more-predictive hypotheses, rather than the traditional trial-and-error approach,” says Berg co-founder, chief executive and SACS Advancement Council member Niven Narain.

Berg’s approach led to the discovery of a new cancer drug. They hope it will also provide therapies for diabetes, Parkinson’s disease and other conditions.


Epidemiology

Epidemiology is a broad discipline that presents several opportunities for the application of data science. The field deals with the incidence, distribution and possible control of diseases and other factors relating to health. The emergence of big data has had a significant impact on epidemiology.

Stephen J Mooney, Daniel J Westreich, and Abdulrahman M El-Sayed write in the journal Epidemiology that “. . . Big Data has evolutionary and revolutionary implications for identifying and intervening on the determinants of population health. We suggest that as more sources of diverse data become publicly available, the ability to combine and refine these data to yield valid answers to epidemiologic questions will be invaluable.”
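Incidence and distribution, two of the quantities epidemiology deals with, are simple to compute once case counts and population figures are combined. The sketch below calculates an incidence proportion per 100,000 people for each region. The region names and counts are invented for illustration, and `incidence_per_100k` is a hypothetical helper, not part of any library.

```python
def incidence_per_100k(new_cases: int, population_at_risk: int) -> float:
    """New cases in a period, scaled to a population of 100,000."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return new_cases * 100_000 / population_at_risk


# Made-up counts: region -> (new cases in the period, population at risk).
regions = {
    "north": (42, 120_000),
    "south": (15, 30_000),
}

rates = {name: incidence_per_100k(cases, pop)
         for name, (cases, pop) in regions.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} new cases per 100,000")
# north: 35.0 new cases per 100,000
# south: 50.0 new cases per 100,000
```

Scaling to a common population size is what makes regions of different sizes comparable: here the south has fewer cases in total but a higher incidence.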

Public health policy

Harnessing big data can lead to significant improvements in public health policy. Data science methodologies can measure aspects of public health at a granular level. Muin J. Khoury and Michael Engelgau write in a blog post that “. . . big data could point to implementation gaps and disparities and accelerate the evaluation of implementation strategies to reach population groups in most need for interventions.”

Advancing discovery of public health and health inequities is a priority at Meharry. The Geographic Information Systems and Visualization Lab uses geospatial and visualization technologies to identify coverage gaps in population health and to develop solutions for improving health outcomes. Likewise, the Population Health Disparity Research Lab applies statistics and machine learning to improve our understanding of population health.


Genomics

Genomics is the study of all of a person’s genes, or genome, and how they interact with each other and that person’s environment. The field is critical to understanding complex diseases like cancer, heart disease and diabetes. Data science and computational biology allow researchers to efficiently integrate genomic and environmental data for an in-depth analysis of diseases.

At Meharry’s Genomics Lab, faculty apply data science skills to develop novel next-generation sequencing methods that identify genomic aberrations causing disease or drug resistance, and they use that expertise to serve Meharry and the broader scientific community.

Mobile Health (mHealth)

A more recent development in the application of data science to health care is mHealth, in which smartphones, wearable sensor devices and other patient monitoring tools capture health care data passively. Leveraging the data from mHealth apps and digital devices leads to improvements in personalized patient care and in the detection of life-threatening disease. At Meharry’s mHealth Wearable Sensors Lab, Dr. Vibhuti Gupta, assistant professor of computer science and data science, is working to design mHealth systems that capture physiological, psychological, and behavioral data, and to develop novel methods, algorithms, and systems that analyze these data for timely, personalized health interventions. Dr. Gupta’s current projects focus on early detection of adverse clinical events in hematologic cancer patients and on early diagnosis and prediction of COVID-19.

A schematic illustration of a project on mHealth wearable sensor data analytics.
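As a rough idea of what analyzing wearable sensor data can look like, the sketch below flags heart rate readings that deviate sharply from the readings just before them. It is a generic rolling z-score check written for illustration, not the method used in the lab described above. The readings are made up, and `flag_anomalies`, the window size, and the threshold are assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(readings: list[float], window: int = 5,
                   threshold: float = 3.0) -> list[int]:
    """Return indices of readings that differ from the mean of the
    preceding `window` readings by more than `threshold` standard
    deviations of that window."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip flat windows, where the standard deviation is zero.
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Made-up heart rate samples in beats per minute.
heart_rate = [72, 74, 71, 73, 75, 74, 72, 118, 73, 74]

print(flag_anomalies(heart_rate))  # [7]
```

A production system would need to account for activity, sensor noise, and each patient's own baseline before treating a flagged reading as a possible clinical event.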


Burroughs, Amy. “Language Processing Tools Improve Care Delivery for Providers.” Health Tech Magazine. May 4, 2020. Accessed March 10, 2021.

Fleming, Nic. “How artificial intelligence is changing drug discovery.” May 30, 2018. Accessed March 8, 2021.

Health Catalyst Editors, “Healthcare NLP: The Secret to Unstructured Data’s Full Potential.” Health Catalyst. April 2, 2019. Accessed March 10, 2021.

Mooney, Stephen J., Westreich, Daniel J. and El-Sayed, Abdulrahman M. “Epidemiology in the Era of Big Data.” Epidemiology. May 26, 2015. Accessed March 8, 2021.

Muin J. Khoury and David A. Chambers. “Can Big Data Science Deliver Precision Public Health?” July 23, 2019. Accessed March 8, 2020.

National Human Genome Research Institute. “Genetics vs. Genomics Fact Sheet.” September 7, 2018. Accessed March 10, 2020.

Norwegian Centre for E-health Research. “Exploring electronic phenotyping for clinical practice.” January 2020. Accessed March 10, 2021.
