Overview 

Foundation and Core courses are divided into 6 sequences, each with between 1 and 4 courses:

  • Computational Methods (3 courses) – Covering data science programming, numerical optimization, and big data analytic methods

  • Machine Learning (4 courses) – Covering data analysis methods, including linear models, neural networks, ensemble methods, and clustering

  • Theory (3 courses) – Covering probability, mathematical statistics, and estimation theory 

  • Data Engineering (2 courses) – Covering applied data management and ML Ops

  • Value (2 courses) – Covering ethical considerations of data science 

  • Research Methods (1 course) – Covering professional development and research practices 

Note: The topics listed below are representative of each course rather than exhaustive. Detailed lists of topics for each course are available.

Machine Learning

These courses cover a wide range of applied data analysis methods.  

Machine Learning I
Introduction
Year 1, Fall Term (Foundation)

  • Focus on general and generalized linear models

  • Regularization 

  • Model diagnostics/evaluation

  • Inference, both Bayesian and Frequentist 

  • Decision trees/random forests

Machine Learning II
Methods and Applications
Year 1, Spring Term (Foundation)

  • Clustering methods 

  • SVMs  

  • Shallow neural nets 

  • Ensemble methods 

  • Dimension reduction 

  • Anomaly detection 

  • Recommender systems 

Machine Learning III
Deep Learning
Year 2, Fall Term (Core)

  • Deep neural networks 

  • Convolutional and recurrent NNs

  • Transformers 

  • Encoder/decoder 

  • Generative networks 

  • GANs 

  • LSTMs

Machine Learning IV
Topic TBD
Year 2, Spring Term (Core)

  • Reinforcement learning

  • Natural language processing

  • Bayesian machine learning

Theory

This sequence covers mathematical statistics, probability theory, and advanced linear algebra for statistics/data science applications.  

Theory I
Probability and Stochastic Processes 
Year 1, Fall Term (Foundation)

  • Probability theory 

  • Random variables 

  • Univariate estimation 

  • Limit theorems 

  • Stochastic processes 

Theory II
Inference and Prediction 
Year 1, Spring Term (Foundation)

  • Frequentist, likelihood & Bayesian inference

  • MLE & method of moments estimation

  • Information theory 

  • Resampling theory 

  • Computational algorithms (e.g., EM, MCMC)

Theory III
Linear Models
Year 2, Fall Term (Core)

  • Linear algebra for linear models 

  • Projections, null spaces, SVD, eigenvalues/eigenvectors

  • Linear model fitting, estimation, and prediction

Data Engineering

This sequence covers “practical” considerations for data scientists. 

Data Engineering I
Data Pipelines & Visualization  
Year 1, Fall Term (Foundation)

  • Local environments 

  • Containers 

  • Using cloud compute 

  • Web scraping

  • Dashboard development 

  • Data wrangling/management

  • Data visualization

Data Engineering II
ML Ops   
Year 2, Spring Term (Core)

  • Scalability, performance & security 

  • Handling resource constraints 

  • Streaming data collection 

  • Cutting-edge database technologies

Value

This sequence covers ethical considerations for data scientists. 

Value I
Data Ethics, Policy & Governance 
Year 1, Fall Term (Foundation)

  • Fairness, accountability & transparency 

  • Digital rights & regulatory frameworks 

  • Harm, bias & discrimination 

  • Policy design, implementation & evaluation 

  • International/multistakeholder agreements 

  • Open movement 

  • Technical debt 

  • Data justice 

Value II
Data and Society 
Year 2, Spring Term (Core)

  • Applied research and case studies in data ethics, policy, and governance

  • Auditing algorithms & automated decision systems (ADS)

  • Risk assessment tools 

  • Civic tech & community engagement 

  • Evidence-based policy making

  • Alternative data governance models, data trusts and data stewardships 

  • Responsible innovation 

Research Methods

This course covers professional development and how data science research is done. It exposes students to landmark papers, discusses how to participate in and evaluate academic research, and gives students practice in academic writing. Students develop a research proposal or literature review during the course.
 

Research Methods I
Year 2, Spring Term (Core)
Topics TBD

Miscellaneous Requirements

  • Students are required to attend the SDS seminar every week 

  • Students spend the first summer working with faculty in labs 

  • There will be a student evaluation after the second year