Use data to drive decisions
The online M.S. in Data Science (MSDS) offers an integrated curriculum centered on excellence in data science that features interdisciplinary study and practical, hands-on learning projects. Designed outside the traditional curricular structure, the MSDS is a forward-looking blueprint of the world in which data science plays an increasingly important role.
We believe you shouldn’t have to compromise on rigor to pursue an MSDS without disrupting your life or job. Our online MSDS program is demanding but flexible in its scheduling. It is time-tested, well-known, and taught by professors with vast experience in industry and academia.
The MSDS curriculum is tightly prescribed. Courses are interdisciplinary and taught by faculty from the School of Data Science and from across the University of Virginia. Several large data sets are shared across courses to increase the program’s cohesion.
Curriculum for students starting the program in Fall 2025 or a later term
The online MSDS program is designed around a spiral learning framework. Students begin by acquiring a foundation in languages, computation, and linear modeling. They then build upon those skills and explore Bayesian machine learning, statistics, data mining and text analytics, computer programming, and data ethics, as well as interdisciplinary electives from across the University. Throughout the program, students apply what they learn and practice effective communication skills through hands-on group projects.
At the conclusion of the MSDS program, students address an important data science challenge through a sponsored team capstone project. Projects connect you with corporate and government partners who present unique challenges for you to tackle through hands-on learning. By applying the tools and techniques learned in the classroom, you gain real-world experience while providing the sponsoring organization valuable data-driven insights and solutions.
The 33-credit-hour online M.S. in Data Science program is offered across five terms. Students take a core set of courses with elective courses offered in semesters 4 and 5. For course details and descriptions by term, see below.
Weekly Breakdown
The online M.S. in Data Science Program employs a flipped classroom model to accommodate the needs of students balancing professional and academic goals. Flipped classrooms are learner-focused environments that allow students to explore course topics independently, outside of class, using asynchronous course content. Live classroom time serves as a forum for class discussion aimed at deepening student understanding of course material. Students also have the option to attend live faculty office hours and group study sessions throughout the week.
Each week, students can expect one hour of live synchronous time per course, scheduled on weekday evenings (ET). Most part-time online MSDS students spend 10 to 12 hours per week per course on asynchronous content, which includes homework, projects, and readings. However, the workload can vary by student and by course, and total hours may depend on learning style and level of comfort with program prerequisites.
Semester 1
DS 6001 - Data Engineering I: Data Pipeline Architecture (3 credits)
Covers the essential environments and tools for data engineering. Topics include Linux, software development and testing, database design and construction, creation and deployment of containers, and data extraction, transformation, and loading.
DS 5030 - Understanding Uncertainty (3 credits)
DS 6002 - Ethics of Big Data (3 credits)
This course examines the ethical issues arising around big data and provides frameworks, context, concepts, and theories to help students think through and deal with the issues as they encounter them in their professional lives.
Semester 2
DS 5012 - Computation for Data Science (3 credits)
DS 6021 - Machine Learning I: Introduction to Predictive Modeling (3 credits)
Semester 3
DS 6030 - Machine Learning II: Data Mining and Statistical Learning (3 credits)
DS 5110 - Data Engineering II: Big Data Systems (3 credits)
Scalable big data systems are a central part of modern data science. This course covers the design and use of parallel dataflow systems (MapReduce/Hadoop and Spark), scalable and parallel Python analytics frameworks, and cloud data systems (cloud storage, cloud-native data processing). A major component of this course is hands-on programming using scalable analytics tools and cloud resources such as Google Cloud and Microsoft Azure.
Semester 4
Restricted Elective (3 credits)
DS 6050 - Machine Learning III: Deep Learning (3 credits)
A graduate-level course on deep learning fundamentals and applications with emphasis on their broad applicability to problems across a range of disciplines. Topics include regularization, optimization, convolutional networks, sequence modeling, generative learning, instance-based learning, and deep reinforcement learning. Students will complete several substantive programming assignments. Prerequisite: a course covering statistical techniques such as regression.
Semester 5
DS 6015 - Data Science Capstone (3 credits)
Restricted Elective (3 credits)
Possible Electives
Below is a sample of electives that have been offered previously in the online MSDS program. Availability of electives may vary by semester. The development of additional electives is underway.
DS 5001 - Exploratory Text Analytics (3 credits)
Introduction to text analytics with a focus on long-form documents, such as reviews, news articles, and novels. Students convert source texts into structure-preserving analytical form and then apply information theory, NLP tools, and vector-based methods to explore language models, topic models, sentiment analyses, and narrative structures. The focus is on unsupervised methods to explore cognitive and social patterns in texts.
DS 5002 - How to Train Your LLM: Engineering LLMs for Custom Tasks (3 credits)
Train your own LLM for a custom task. Learn about the LLM lifecycle, from architecture to pre-training, supervised fine-tuning, deployment, and model editing and updating, along with LLM limitations. End up with your own trained LLM, a Hugging Face model card you can show off in technical interviews, and a plan for how to stay up to date with this fast-moving field.
DS 5003 - Healthcare for Data Science (3 credits)
Provides the healthcare domain knowledge, understanding of healthcare data, and data science methodologies needed to solve problems in the field. Students learn about data types, models, and sources, including electronic health record data; health outcomes, quality, risk, and safety data; unstructured data, such as clinical text; biomedical sensor data; and biomedical image data. Coursework includes querying with SQL, data visualization with Tableau, and analysis and prediction with Python.
DS 5400 - Business Analytics for Data Scientists (3 credits)
This course focuses on the application of data science to critical problems and opportunities in business. You will learn business concepts in strategy, markets, and competition, and will apply data science to analytical projects in operations, marketing, human resources, and finance. Additional topics include experimentation, business cases, team leadership, and executive communication. Students will use Python or R, and Dataiku DSS.
DS 6040 - Bayesian Machine Learning (3 credits)
Bayesian inferential methods provide a foundation for machine learning under conditions of uncertainty. Bayesian machine learning techniques can help us to more effectively address the limits to our understanding of world problems. This class covers the major related techniques, including Bayesian inference, conjugate prior probabilities, naive Bayes classifiers, expectation maximization, Markov chain Monte Carlo, and variational inference.
SARC 5400 - Data Visualization (3 credits)
Thinking with Images. People have been looking at data for centuries, with their eyes, to discover patterns, meaning, and insight into the most important challenges of their time. This course teaches visual and spatial thinking coupled with visual data tools and interactive web coding to envision information. Going far beyond plotting to find ways of responding to complex problems, we will study and make useful, compelling, and beautiful tools for seeing.
The Graduate Record represents the official repository for academic program requirements.