Overview 

Foundation and Core courses are divided into 6 sequences, each with between 1 and 4 courses:

  • Computational Methods (3 courses) – Covering data science programming, numerical optimization, and big data analytic methods

  • Machine Learning (4 courses) – Covering data analysis methods, including linear models, neural networks, ensemble methods, and clustering

  • Theory (3 courses) – Covering probability, mathematical statistics, and estimation theory 

  • Data Engineering (2 courses) – Covering applied data management and ML Ops

  • Value (2 courses) – Covering ethical considerations of data science 

  • Research Methods (1 course) – Covering professional development and research practices 

Note: The topics listed below are representative of those covered in each course. The Graduate Record is the official repository for academic program requirements.

Computational Methods

These courses cover data science programming, numerical optimization, and big data analytic methods.

Computation I: Fundamentals
Year 1, Fall Term (Foundation)

  • Basic and advanced data structures
  • Searching and sorting
  • Greedy algorithms
  • Linear programming
  • Basics of databases

Computation II: Numerical Analysis & Optimization
Year 1, Spring Term (Foundation)

  • Numerical errors
  • Root finding algorithms
  • Numerical linear algebra
  • Numerical optimization

Computation III: Distributed Computing
Year 2, Fall Term (Core)

  • Spark for large-scale analytics
  • Resilient distributed datasets
  • Tools for data processing, storage, and retrieval (AWS)

Machine Learning

These courses cover a wide range of applied data analysis methods.  

Machine Learning I
Introduction
Year 1, Fall Term (Foundation)

  • Generalized linear models (primary focus)
  • Regularization
  • Model diagnostics and evaluation
  • Inference, both Bayesian and frequentist
  • Decision trees and random forests

Machine Learning II
Methods and Applications
Year 1, Spring Term (Foundation)

  • Clustering methods 
  • SVMs  
  • Shallow neural nets 
  • Ensemble methods 
  • Dimension reduction 
  • Anomaly detection 
  • Recommender systems 

Machine Learning III
Deep Learning
Year 2, Fall Term (Core)

  • Deep neural networks 
  • Convolutional and recurrent neural networks
  • Transformers 
  • Encoder/decoder 
  • Generative networks 
  • GANs 
  • LSTM  

Machine Learning IV
Advanced Topics
Year 2, Spring Term (Core)

  • Reinforcement Learning 
  • Natural Language Processing 
  • Bayesian Machine Learning 

Theory

This sequence covers mathematical statistics, probability theory, and advanced linear algebra for statistics/data science applications.  

Theory I
Probability and Stochastic Processes 
Year 1, Fall Term (Foundation)

  • Probability theory 
  • Random variables 
  • Univariate estimation 
  • Limit theorems 
  • Stochastic processes 

Theory II
Inference and Prediction 
Year 1, Spring Term (Foundation)

  • Frequentist, likelihood & Bayesian inference
  • MLE & Method of Moments estimation 
  • Information theory 
  • Resampling theory 
  • Computational algorithms (e.g. EM, MCMC) 

Theory III
Linear Models
Year 2, Fall Term (Core)

  • Linear algebra for linear models 
  • Projections, null spaces, SVD, eigenvalues/eigenvectors
  • Linear model fitting, estimation, and prediction

Data Engineering

This sequence covers practical considerations for data scientists, including data management, visualization, and ML Ops.

Data Engineering I
Data Management & Visualization  
Year 1, Fall Term (Foundation)

  • Local environments 
  • Containers 
  • Using cloud compute 
  • Web scraping
  • Dashboard development
  • Data wrangling and management
  • Data visualization

Data Engineering II
ML Ops   
Year 2, Spring Term (Core)

  • Scalability, performance & security 
  • Handling resource constraints 
  • Streaming data collection 
  • Cutting-edge database technologies

Value

This sequence covers ethical considerations for data scientists. 

Value I
Data Ethics, Policy & Governance 
Year 1, Fall Term (Core)

  • Fairness, accountability & transparency 
  • Digital rights & regulatory frameworks 
  • Harm, bias & discrimination 
  • Policy design, implementation & evaluation 
  • International/multistakeholder agreements 
  • Open movement 
  • Technical debt 
  • Data justice 

Value II
Data and Society 
Year 2, Spring Term (Core)

  • Applied research and case studies on data ethics, policy, and governance
  • Auditing algorithms & ADS 
  • Risk assessment tools 
  • Civic tech & community engagement 
  • Evidence-based policymaking 
  • Alternative data governance models, including data trusts and data stewardships
  • Responsible innovation 

Research Methods

Data Science Research Methodology
Year 2, Spring Term (Core)

This course helps students transition into principal investigators and generators of data science knowledge. It develops the practical skills needed to conduct high-quality data science research, advances students' development as producers and critical consumers of research, and furthers their development as professional data scientists, broadly defined.

Miscellaneous Requirements

  • Students are required to attend the School of Data Science seminar every week 
  • Students spend the first summer working with faculty in labs 
  • There will be a student evaluation after the second year