3 credit hours, Spring, Summer. Pre-requisite(s): None.
Students are expected to have a working familiarity with the discipline of data science and analytics and general knowledge of the impact of Big Data on businesses and corporations. All students should also have a working knowledge of Microsoft Office and be familiar with Internet access and usage.
- Introduction to Python. Python syntax to write basic computer programs; Using the interpreter; Built-in and user-defined functions; Introduction to object-oriented programming in Python.
- Introduction to R. Simple graphing; R Basics: variables, strings, vectors; Data Structures: arrays, matrices, lists, dataframes; Programming Fundamentals: conditions and loops, functions, objects and classes, debugging.
- Introduction to SAS Programming. The SAS Operating Environment; SAS Programming Essentials: SAS Program Structure, SAS Program Syntax; Getting Data In and Out of SAS; Printing and Displaying Data; Introduction to SAS Graphics.
3 credit hours, Spring & Summer. Pre-requisite(s): None.
3 credit hours, Fall, Spring. Pre-requisite(s): Elementary Statistics.
Principles of biostatistics and the analysis of clinical and epidemiological data. Descriptions and derivations of statistical methods as well as demonstrations of these methods using SAS. Topics include basic analysis methods, elementary concepts, statistical models and applications of probability, commonly used sampling distributions, parametric and nonparametric one and two sample tests, confidence intervals, applications of analysis of two-way contingency table data, simple linear regression, and simple analysis of variance.
The concepts and structures used to store, analyze, manage, and present (visualize) information and navigation using Python, SQL, SAS, and QGIS. Topics will include information analysis and organizational methods, and metadata concepts and applications. Students will be guided in identifying the disparate data sources needed to perform analysis for a given real-world problem; typically, data from a single source will not be adequate for the required analysis. Students will pull data from these sources, import it into SAS, and use several SAS procedures to detect invalid data; format, validate, and clean the data; and impute missing values. This will prepare the data for statistical analysis and decision modeling in SAS.
- Python Lists, Sets, Strings, Tuples, and Dictionaries; Reading and manipulating CSV files, and the NumPy library; Introduction to the pandas Series and DataFrame abstractions as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively.
- Introduction to Databases and basic SQL; Using string patterns and ranges to search data and to sort and group data in result sets; Working with multiple tables in a relational database using join operations; Using Python to connect to databases and then create tables, load data, query data using SQL, and analyze data using Python.
- Introduction to Data Step in SAS; Processing Data in Groups; Manipulating Data with Functions; Data Extraction and Preparation, Concatenating, Merging and Interleaving Tables; Using SQL in SAS to query and join tables.
- Preparing comprehensive plans to manage spatial and non-spatial health-related data; building versioned enterprise databases; and knowing how to implement best practices for managing databases for health projects and organizations.
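As an illustration of the database topics listed above (connecting from Python, creating tables, loading data, searching with string patterns and ranges, joining, grouping, and sorting), here is a minimal sketch. It is not taken from the course materials: it assumes Python's standard `sqlite3` module with an in-memory database, and the `patients` and `visits` tables, their columns, and all values are made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.execute(
        "CREATE TABLE visits ("
        "patient_id INTEGER REFERENCES patients(id), clinic TEXT, cost REAL)"
    )
    # Load data with parameterized inserts (hypothetical records).
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [(1, "Ana Diaz", 34), (2, "Ben Cole", 58), (3, "Ann Lee", 45)],
    )
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?, ?)",
        [(1, "North", 120.0), (1, "South", 80.0),
         (2, "North", 250.0), (3, "South", 50.0)],
    )
    conn.commit()
    return conn


def names_matching(conn, pattern, min_age, max_age):
    """Search with a string pattern (LIKE) and a range (BETWEEN), sorted by name."""
    rows = conn.execute(
        "SELECT name FROM patients "
        "WHERE name LIKE ? AND age BETWEEN ? AND ? "
        "ORDER BY name",
        (pattern, min_age, max_age),
    ).fetchall()
    return [row[0] for row in rows]


def total_cost_by_patient(conn):
    """Join the two tables, group by patient, and sort by total cost (descending)."""
    return conn.execute(
        "SELECT p.name, SUM(v.cost) AS total "
        "FROM patients AS p JOIN visits AS v ON v.patient_id = p.id "
        "GROUP BY p.id, p.name "
        "ORDER BY total DESC"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(names_matching(conn, "An%", 30, 50))
    print(total_cost_by_patient(conn))
    conn.close()
```

The same queries could be run against any relational database with a Python driver; SQLite is used here only because it needs no server or installation.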
Principles of biostatistics focusing on statistical modeling approaches to the analysis of continuous, categorical, and survival data. Regression modeling including the links between regression and analysis of variance (parameterization), multiple regression, indicator variables, use of contrasts, multiple comparison procedures and regression diagnostics. The course will generalize these modeling concepts to different types of outcome data including categorical outcomes (i.e., logistic and log-linear modeling) and survival outcomes (i.e., proportional hazards analysis). Students are taught to conduct the relevant analysis using SAS and R.
3 credit hours, Fall, Summer. Pre-requisite(s): MSDS 510, (MSDS 520 or MSBDS 520).
Deep dive into recent advances in AI in healthcare, focusing on deep learning approaches for healthcare problems. Foundations of neural networks. Cutting-edge deep learning models in the context of a variety of healthcare data including image, text, multimodal and time-series data. Advanced topics on open challenges of integrating AI in a societal application such as healthcare, including interpretability, robustness, privacy and fairness.
3 credit hours, Fall, Spring. Pre-requisite(s): (MSDS 530 or MSBDS 530), (MSDS 535 or MSBDS 540).
Introduction to machine learning with biomedical applications. Survey of machine learning techniques, including traditional statistical methods, resampling techniques, model selection and regularization, tree-based methods, principal components analysis, cluster analysis, artificial neural networks, and deep learning. Students implement machine learning models with open-source software for data science. They explore data and learn from data, finding underlying patterns useful for data reduction, feature analysis, prediction, and classification.
3 credit hours, Fall, Spring. Pre-requisite(s): MSDS 520 or MSBDS 520.
Data visualization tools and technologies (including SAS Visual Analytics, R and ggplot2, Tableau) essential to analyze massive disparate amounts of information and make data-driven decisions.
3 credit hours, Spring & Summer. Pre-requisite(s): MSDS 530 or MSBDS 530.
Analysis of ethical issues, algorithmic challenges, and policy decisions (and the social implications of these decisions) that arise when addressing real-world problems through the lens of data science, and of the choices made at each stage of the data analysis pipeline, from data collection and storage to understanding feedback loops in analysis.
3 credit hours, Spring, Summer. Pre-requisite(s): MSDS 565, 575.
The research process investigating information needs, creation, organization, flow, retrieval, and use. Stages include: research definition, question, objectives, data collection and management, data analysis and data interpretation. Techniques include: observation, interviews, questionnaires, and transaction-log analysis.
3 credit hours, Fall, Spring. Pre-requisite(s): MSDS 580.
Comprehensive real-life industry-type capstone, oriented toward the student’s domain of interest. Projects will include: formulation of a question to be answered by the data; collection, cleaning and processing of data; choosing and applying a suitable model and/or analytic method to the problem; and communicating the results to a non-technical audience.
Students choose one of two Concentration Tracks, each comprising 3 Courses (9 credit hours):
Precision Medicine Informatics Concentration Track:
3 credit hours, Fall, Spring. Pre-requisite(s): MSDS 525, (MSDS 530 or MSBDS 530).
This course will focus on the inherent translational informatics challenges, concerns, and opportunities afforded by precision medicine, which aims to provide a more accurate, personalized characterization of patient populations based on molecular (e.g., genomic, proteomic), clinical (e.g., comorbidities), environmental exposure, lifestyle, patient preference, and other information. Informatics is a necessary component of precision medicine. This includes managing Big Data, advanced concepts for working with the wide variety of genomic sequencing datasets emerging in the post-genomic era from several sequencing platforms, creating learning systems for knowledge generation, providing access for individual involvement, and ultimately supporting the optimal delivery of precision treatments derived from translational research.
3 credit hours, Fall, Summer. Pre-requisite(s): MSDS 525.
Introduction to systems development for computational science. Design, develop, and deploy a set of software components to produce a scalable, reliable, and reproducible experimental system for scientific investigation; Use a variety of approaches to software development team organization, and select techniques that are appropriate in different circumstances.
3 credit hours, Fall, Spring. Pre-requisite(s): MSDS 550 or MSBDS 550.
Tools and techniques for building statistical or machine learning models to make predictions based on data. NLP and Text Analytics, Time Series, Experimentation and Optimization.
Population Health Informatics Concentration Track:
3 credit hours, Fall, Spring. Pre-requisite(s): MSDS 525.
Uses a problem- or inquiry-based learning approach where students will take the lead in designing and implementing data- and technology-driven projects that generate data analytics-based solutions for complex public health issues and develop useful data-driven decision strategies. Students will also reproduce and replicate several case studies that illustrate the power of technologies like GIS, GPS, drones, spatial narratives, and web visualization in population health. They will examine the impact of technology on population health informatics and vice versa.
3 credit hours, Fall, Summer. Pre-requisite(s): MSDS 525.
Exposes students to foundational GIS concepts, tools and methods relevant to the health sector. Special attention is given to acquiring, organizing, integrating, analyzing and visualizing location-based health data with the aid of closed- and open-source GIS software. Students will develop practical competencies in applying GIS to achieve several goals and purposes including understanding and solving spatio-temporal population health problems in ways that are socially and ethically appropriate.
3 credit hours, Fall, Spring. Pre-requisite(s): MSBDS 545 or MSDS 550 or MSBDS 550.