Use data to drive decisions

The M.S. in Data Science (MSDS) is an 11-month integrated program that features an interdisciplinary curriculum and practical, hands-on learning projects. Designed outside the traditional curricular structure, the MSDS is a forward-looking blueprint for a world in which data science plays an increasingly important role.

The MSDS curriculum is tightly prescribed: students take a core set of courses throughout the year, with two electives built into the schedule. Courses are interdisciplinary and taught by faculty at the School of Data Science and from across the University of Virginia. Several large data sets are used across courses to increase the program’s cohesion, and students work in teams throughout, building strong relationships with their peers and faculty mentors.

Curriculum

The MSDS program is designed around a spiral learning framework. Students begin by acquiring a foundation in languages, computation, and linear modeling. They then build upon those skills and explore Bayesian machine learning, statistics, data mining and text analytics, computer programming, and data ethics, as well as interdisciplinary electives from across the University. Throughout the program, students apply what they learn through hands-on group projects and practice effective communication skills. 

At the conclusion of the MSDS program, students choose their culminating experience. They may participate in a pre-approved, qualified internship over the summer and complete the required program credits in the following fall term. Alternatively, they may address an important data science challenge through a sponsored team capstone project and finish the remaining program credits to graduate at the end of the summer term. Capstone projects connect students with corporate and government partners who present unique challenges to tackle through hands-on learning. By applying the tools and techniques learned in the classroom, students gain real-world experience while providing the sponsoring organization with valuable data-driven insights and solutions.

The 33-credit-hour M.S. in Data Science program is offered across three terms: Fall, Spring, and Summer. Students take a core set of courses with elective courses offered during the spring. For course details and descriptions by term, see below.  

International Students: This program is eligible for the OPT STEM extension.

Term 1, Fall

12 credits

  • DS 5111 Data Engineering I: Data Pipeline Architecture (3 credits)
    • Covers the essential environments and tools for data engineering. Topics include Linux, software development and testing, database design and construction, creation and deployment of containers, and data extraction, transformation, and loading (ETL).
  • Understanding Uncertainty (3 credits)
  • Computation for Data Science (3 credits)
  • Machine Learning I: Introduction to Predictive Modeling (3 credits) 

Term 2, Spring

12 credits

  • DS 6002 Ethics of Big Data (3 credits)
    • This course examines the ethical issues arising around big data and provides frameworks, context, concepts, and theories to help students think through and deal with the issues as they encounter them in their professional lives.
  • Machine Learning II: Data Mining and Statistical Learning (3 credits)
  • DS 6050 Machine Learning III: Deep Learning (3 credits)
    • A graduate-level course on deep learning fundamentals and applications with emphasis on their broad applicability to problems across a range of disciplines. Topics include regularization, optimization, convolutional networks, sequence modeling, generative learning, instance-based learning, and deep reinforcement learning. Students will complete several substantive programming assignments. Prerequisite: a course covering statistical techniques such as regression.
  • Restricted Elective (3 credits)
  • Capstone Program Project Preparation (non-credit workshop) 

Term 3, Summer or Fall

Summer Term

9 credits

  • DS 6015 Data Science Capstone (3 credits)
  • DS 5110 Data Engineering II: Big Data Systems (3 credits)
    • Scalable big data systems are a central part of modern data science. This course will cover topics including design and use of parallel dataflow systems (MapReduce/Hadoop and Spark), scalable and parallel Python analytics frameworks, and cloud data systems (cloud storage, cloud-native data processing). A major component of this course is hands-on programming using scalable analytics tools and cloud resources such as Google Cloud and Azure Cloud.
  • Restricted Elective (3 credits)

Graduate in August 

OR

Participate in a pre-approved qualified internship over the Summer term, receiving course credit in the Fall term

Fall Term

  • DS 5110 Data Engineering II: Big Data Systems (3 credits)
  • Restricted Elective (3 credits)
  • Additional Elective if desired

Graduate in December 

Sample Electives

Students select their elective courses in consultation with the Program Director. A variety of electives are available, including but not limited to those listed in the Graduate Record. Students are required to take a minimum of 6 total credit hours of elective courses. Elective courses must be at the 5000 level or higher to count toward the MSDS program unless otherwise preapproved. Examples of electives:

  • CS 6160: Theory of Computation
  • CS 6444: Parallel Computing
  • CS 6501: Special Topics in Computer Science (e.g., Text Mining, Cloud Computing, Defense Against the Dark Arts, Vision & Language)
  • CS 6750: Database Systems
  • ECON 8720: Time Series Econometrics
  • ECON 7720: Econometrics II
  • EVSC 7070: Advanced Use of Geographical Information Systems
  • GCOM 7240: Advanced Quantitative Analysis
  • PHS 5705: Recent Advances in Public Health Genomics
  • PHS 7310: Clinical Trials Methodology
  • PSYC 5720: Fundamentals of Item Response Theory
  • PSYC 7760: Introduction to Applied Multivariate Methods
  • SARC 5400: Data Visualization
  • STAT 6250: Longitudinal Data Analysis
  • STAT 6260: Categorical Data Analysis
  • SYS 6023: Cognitive Systems Engineering
  • SYS 6050: Risk Analysis
  • SYS 6582: Selected Topics in Systems Engineering (e.g., Reinforcement Learning, User Experience Design, Sensors & Perception)
  • SYS 7001: System and Decision Sciences

Availability of electives varies by year, and courses must be approved by the School of Data Science. Students interested in taking more than 6 credit hours of electives will need to obtain faculty approval.

Learn More About the Genomics Focus

The Graduate Record represents the official repository for academic program requirements.

Alumni Testimonial


“During my internship with Apple, I did a lot of data engineering and data science. I was able to work with a lot of the technologies that I learned in the MSDS program and am excited to take what I learned and apply it to my work.” — Arishya Ansari, MSDS 2021, Machine Learning Engineer, Apple (San Francisco, California)
