Overview
Foundation and Core courses are divided into 6 sequences, each with between 1 and 4 courses:
Computational Methods (3 courses) – Covering data science programming, numerical optimization and big data analytic methods
Machine Learning (4 courses) – Covering data analysis methods, including but not limited to linear models, neural networks, ensemble methods, clustering, etc.
Theory (3 courses) – Covering probability, mathematical statistics, and estimation theory
Data Engineering (2 courses) – Covering applied data management and ML ops
Value (2 courses) – Covering ethical considerations of data science
Research Methods (1 course) – Covering professional development and research practices
Note: The topics listed below are meant to be representative of the topics covered in each course. The Graduate Record represents the official repository for academic program requirements.
Computational Methods
These courses cover data science programming, numerical optimization and big data analytic methods.
Computation I: Fundamentals
Year 1, Fall Term (Foundation)
- Basic and advanced data structures
- Searching and sorting
- Greedy algorithms
- Linear programming
- Basics of databases
Computation II: Numerical Analysis & Optimization
Year 1, Spring Term (Foundation)
- Numerical errors
- Root finding algorithms
- Numerical linear algebra
- Numerical optimization
Computation III: Distributed Computing
Year 2, Fall Term (Core)
- Spark for large-scale analytics
- Resilient distributed datasets
- Tools for data processing, storage, and retrieval (AWS)
Machine Learning
These courses cover a wide range of applied data analysis methods.
Machine Learning I
Introduction
Year 1, Fall Term (Foundation)
- Generalized linear models (primary focus)
- Regularization
- Model diagnostics/evaluation
- Inference, both Bayesian and frequentist
- Decision trees/random forests
Machine Learning II
Methods and Applications
Year 1, Spring Term (Foundation)
- Clustering methods
- SVMs
- Shallow neural nets
- Ensemble methods
- Dimension reduction
- Anomaly detection
- Recommender systems
Machine Learning III
Deep Learning
Year 2, Fall Term (Core)
- Deep neural networks
- Convolutional and recurrent NNs
- Transformers
- Encoder/decoder
- Generative networks
- GANs
- LSTM
Machine Learning IV
Advanced Topics
Year 2, Spring Term (Core)
- Reinforcement Learning
- Natural Language Processing
- Bayesian Machine Learning
Theory
This sequence covers mathematical statistics, probability theory, and advanced linear algebra for statistics/data science applications.
Theory I
Probability and Stochastic Processes
Year 1, Fall Term (Foundation)
- Probability theory
- Random variables
- Univariate estimation
- Limit theorems
- Stochastic processes
Theory II
Inference and Prediction
Year 1, Spring Term (Foundation)
- Frequentist, likelihood & Bayesian inference
- MLE & Method of Moments estimation
- Information theory
- Resampling theory
- Computational algorithms (e.g. EM, MCMC)
Theory III
Linear Models
Year 2, Fall Term (Core)
- Linear algebra for linear models
- Projections, null spaces, SVD, eigenvalues/eigenvectors
- Linear model fitting, estimation and prediction
Data Engineering
This sequence covers “practical” considerations for data scientists.
Data Engineering I
Data Management & Visualization
Year 1, Fall Term (Foundation)
- Local environments
- Containers
- Using cloud compute
- Web scraping
- Dashboard development
- Data wrangling/management
- Data visualization
Data Engineering II
ML Ops
Year 2, Spring Term (Core)
- Scalability, performance & security
- Handling resource constraints
- Streaming data collection
- Cutting-edge database technologies
Value
This sequence covers ethical considerations for data scientists.
Value I
Data Ethics, Policy & Governance
Year 1, Fall Term (Core)
- Fairness, accountability & transparency
- Digital rights & regulatory frameworks
- Harm, bias & discrimination
- Policy design, implementation & evaluation
- International/multistakeholder agreements
- Open movement
- Technical debt
- Data justice
Value II
Data and Society
Year 2, Spring Term (Core)
- Applied research and case studies in data ethics, policy, and governance
- Auditing algorithms & automated decision systems (ADS)
- Risk assessment tools
- Civic tech & community engagement
- Evidence-based policymaking
- Alternative data governance models, such as data trusts and data stewardship
- Responsible innovation
Research Methods
Data Science Research Methodology
Year 2, Spring Term (Core)
This course helps students transition into principal investigators and generators of data science-based knowledge. It develops the practical skills needed to conduct high-quality data science research, advances students' development as producers and critical consumers of research, and furthers their development as professional data scientists, broadly defined.
Miscellaneous Requirements
- Students are required to attend the School of Data Science seminar every week
- Students spend the first summer working with faculty in labs
- There will be a student evaluation after the second year