Use data to drive decisions

The M.S. in Data Science (MSDS) offers an 11-month integrated, interdisciplinary curriculum built around practical, hands-on learning projects. Designed outside the traditional curricular structure, the MSDS is a forward-looking blueprint for a world in which data science plays an increasingly important role.

The MSDS curriculum is tightly prescribed: students take a core set of courses throughout the year, with two electives built into the schedule. Courses are interdisciplinary and taught by faculty at the School of Data Science and from across the University of Virginia. Several large data sets are shared across courses to increase the program’s cohesion, and students work consistently in teams throughout, building strong relationships with their peers and faculty mentors.

Curriculum

The MSDS program is designed around a spiral learning framework. Students begin by acquiring a foundation in languages, computation, and linear modeling. They then build upon those skills and explore Bayesian machine learning, statistics, data mining and text analytics, computer programming, and data ethics, as well as interdisciplinary electives from across the University. Throughout the program, students apply what they learn through hands-on group projects and practice effective communication skills. 

At the conclusion of the MSDS program, students address an important data science challenge through a sponsored team capstone project. Projects connect students with corporate and government partners who present unique challenges to tackle through hands-on learning. By applying the tools and techniques learned in the classroom, students gain real-world experience while providing the sponsoring organization with valuable data-driven insights and solutions.

The 32-credit-hour M.S. in Data Science program is offered across three terms: Summer, Fall, and Spring. Students take a core set of courses with elective courses offered during the spring. For course details and descriptions by term, see below.  

Summer Term

9 credit hours

DS 5100: Programming for Data Science (3 credits)
An introduction to essential programming concepts, structures, and techniques. Students will gain confidence not only in reading code but also in learning what it means to write good-quality code. Essential complementary topics are also taught, such as testing and debugging, exception handling, and an introduction to visualization. This course is project-based, consisting of a semester project and final project presentations.

DS 6001: Practice and Application of Data Science (3 credits)
This course covers data science practice, including communication, exploratory data analysis, and visualization. Also covered are the selection of algorithms to suit the problem to be solved, user needs, and data. Case studies will explore the impact of data science across different domains.

STAT 6021: Linear Models for Data Science (3 credits)
An introduction to linear statistical models in the context of data science. Topics include simple and multiple linear regression, logistic regression, and generalized linear models. The primary software is R. Data wrangling in R will also be covered.

Fall Term

12 credit hours 

CS 5012: Foundations of Computer Science (3 credits)
Provides a foundation in discrete mathematics, data structures, algorithmic design and implementation, computational complexity, parallel computing, and data integrity and consistency for non-CS, non-CpE students. Case studies and exercises will be drawn from real-world examples (e.g., bioinformatics, public health, marketing, and security).

DS 6030: Statistical Learning (3 credits)
This course covers fundamentals of data mining and machine learning within a common statistical framework. Topics include regression, classification, clustering, regularization, tree-based methods, ensembles, boosting, and support vector machines. Coursework is conducted in the R programming language.

DS 6040: Bayesian Machine Learning (3 credits)
Bayesian inferential methods provide a foundation for machine learning under conditions of uncertainty. Bayesian machine learning techniques can help us to more effectively address the limits to our understanding of world problems. This class covers the major related techniques, including Bayesian inference, conjugate prior probabilities, naive Bayes classifiers, expectation maximization, Markov chain Monte Carlo, and variational inference.

DS 6002: Ethics of Big Data (2 credits)
This course examines the ethical issues arising around big data and provides frameworks, context, concepts, and theories to help students think through and deal with the issues as they encounter them in their professional lives.

DS 6011: Data Science Capstone Project Work I (1 credit)
This course is designed for capstone project teams to meet in groups, with advisors, and with clients to advance work on their projects. 

Spring Term

11 credit hours

DS 6050: Deep Learning (3 credits)
A graduate-level course on deep learning fundamentals, techniques, and applications, with emphasis on their broad applicability to problems across a range of disciplines. Topics include regularization, optimization, convolutional networks, sequence modeling, generative learning, instance-based learning, and deep reinforcement learning. Students are required to have sufficient computational background to complete several substantive programming assignments.

DS 6013: Data Science Capstone Project Work II (2 credits)
This course is designed for capstone project teams to meet in groups, with advisors, and with clients to advance work on their projects.

Elective 1: 5000-level or higher (3 credit hours)*

Elective 2: 5000-level or higher (3 credit hours)*

Sample Electives

Students select their Spring Term elective courses in consultation with the Program Director. A variety of electives are available, including but not limited to those listed in the Graduate Record. Students are required to take a minimum of 6 total credit hours of elective courses. Elective courses must be at the 5000 level or higher to count toward the MSDS program unless otherwise preapproved. Examples of electives:

  • CS 6160: Theory of Computation
  • CS 6444: Parallel Computing
  • CS 6501: Special Topics in Computer Science (e.g., Text Mining, Cloud Computing, Defense Against the Dark Arts, Vision & Language)
  • CS 6750: Database Systems
  • ECON 8720: Time Series Econometrics
  • ECON 7720: Econometrics II
  • EVSC 7070: Advanced Use of Geographical Information Systems
  • GCOM 7240: Advanced Quantitative Analysis
  • PHS 5705: Recent Advances in Public Health Genomics
  • PHS 7310: Clinical Trials Methodology
  • PSYC 5720: Fundamentals of Item Response Theory
  • PSYC 7760: Introduction to Applied Multivariate Methods
  • SARC 5400: Data Visualization
  • STAT 6250: Longitudinal Data Analysis
  • STAT 6260: Categorical Data Analysis
  • SYS 6023: Cognitive Systems Engineering
  • SYS 6050: Risk Analysis
  • SYS 6582: Selected Topics in Systems Engineering (e.g., Reinforcement Learning, User Experience Design, Sensors & Perception)
  • SYS 7001: System and Decision Sciences

Availability of electives varies by year, and courses must be approved by the School of Data Science. Students interested in taking more than 6 credit hours of electives will need to obtain faculty approval.

The Graduate Record represents the official repository for academic program requirements.

Alumni Testimonial

“During my internship with Apple, I did a lot of data engineering and data science. I was able to work with a lot of the technologies that I learned in the MSDS program and am excited to take what I learned and apply it to my work.” — Arishya Ansari, MSDS 2021, Machine Learning Engineer, Apple (San Francisco, California)
