NOTE: This is the web page for the Spring 2021 version of the course. See Spring 2023 version
See blog posts for (almost) all lectures in this seminar
Official teaching fellows: Yamini Bansal ybansal (at) g.harvard.edu, Javin Pombra
Unofficial teaching fellows: Gal Kaplun galkaplun (at) g.harvard.edu, Dimitris Kalimeris kalimeris (at) g.harvard.edu, Preetum Nakkiran preetum (at) cs.harvard.edu
See home page for Harvard CS 229br and MIT 18.408.
Introductory blog post by Boaz: Machine Learning Theory with Bad Drawings
Course description: This will be a graduate-level course on recent advances and open questions in the theory of machine learning and specifically deep learning. We will review both classical results and recent papers in areas including classifiers and generalization gaps, representation learning, generative models, adversarial robustness and out of distribution performance, and more.
This is a fast-moving area and it will be a fast-moving course. We will aim to cover both state-of-the-art results and the intellectual foundations for them, and have a substantive discussion on both the “big picture” and technical details of the papers. In addition to the theoretical lectures, the course will involve a programming component aiming to get students to the point where they can both reproduce results from papers and work on their own research. This component will be largely self-directed and we expect students to be proficient in Python and in picking up technologies and libraries on their own (aka “Stack Overflow oriented programming”). We will ensure students have access to the appropriate computational resources (i.e., GPUs).
MIT “Sister seminar”: This Harvard seminar will be coordinated with a “sister seminar” at MIT, taught by Ankur Moitra. We recommend that students taking CS 229br also take the MIT course, but this is not required. The two courses will share some but not all lectures and assignments. So, if you take CS 229br, please keep the Wednesday 12-3 slot free as well.
Prerequisites (for both CS 229br and MIT 18.408): Both courses will require mathematical maturity, and proficiency with proofs, probability, and information theory, as well as the basics of machine learning. We expect that students will have both theory background (at Harvard: CS 121 and CS 124 or similar, at MIT: 6.046 or similar) and machine learning background (at Harvard: CS 181 or 183 or similar, at MIT: 6.036 or similar).
Pre lecture introductory blog.
Monday, January 25: Introduction to course, blitz through classical learning theory, Zhang et al experiments. lecture notes (blog) - slides (pdf) - slides (Powerpoint with animations and annotation) - video
Monday, February 1: Learning dynamics: over-parameterized linear regression, deep linear networks, simplicity bias, early layers, lower bound for parities. lecture notes (blog) - slides (pdf) - slides (Powerpoint with animation and annotation) - video
Monday, February 8: Unsupervised learning and generative models (Auto-encoders, Variational Auto-encoders, Flow based models, auto-regressive models, GANs), representation learning (contrastive learning). lecture notes (blog) - slides (pdf) - slides (Powerpoint with animation and annotation) - video
Monday, February 22: Robust statistics, out of distribution performance, robustness to data poisoning and adversarial perturbation attacks. lecture notes (blog) - lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - video
Monday, March 8: Variational inference, statistical physics (Boltzmann distribution, mean-field models, a bit of replica method) lecture notes (blog) - lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - video
Monday, March 15: Natural language processing, guest lecture by Sasha Rush. lecture notes (blog)
Monday, March 22: Theoretical neuroscience, visualizing and interpreting neural networks, guest lecture by Chris Olah
Monday, March 29: Causality and fairness. lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - video
Monday, April 5: Bandits, contextual bandits, and reinforcement learning, guest lecture by Sham Kakade. lecture notes (blog) - video of lecture. Lecture slides: Original form: main / bandit analysis. Annotated: main / bandit analysis.
Monday, April 12: Privacy. lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - video
Monday, April 19: TBD
Monday, April 26: TBD
Yamini Bansal: Mondays, 5PM - 6PM EST (Message me on Slack and we can start a Zoom call!)
Is there a complete plan of all lectures and assignments?
No - this course will be an experiment, for both me and the students, and we will figure out how much we can cover and in what way as we go along. The goal is to start with some of the foundations and quickly get to discussing recent papers. The intention is that students will get to the point where they can read (and sometimes also reproduce) recent ML papers, and hopefully also generate new insights.
What will the format of the course be like?
We will have weekly lectures/discussions, and experimental homeworks/projects. The lectures will focus on describing and discussing papers and theory, while problem sets / projects will be more empirical. We will have formal or informal “sections” where the unofficial TFs will help with technical issues in implementations, but we will also rely on students looking up material and helping one another.
What is expected of students?
Students will be expected to do some reading before lectures, and to work on some experimental homework assignments, typically involving reproducing a paper or trying out an experiment. The lectures will not discuss how to run experiments or implement neural networks, but the teaching fellows will be available. We will also expect students to look up resources on their own (such as this excellent deep learning course of LeCun & Canziani) and to help one another. There will also be a project, and students may also need to write scribe notes for one lecture.
How will students be graded?
The course is intended for graduate students or advanced undergraduate students who have mostly completed their requirements but are deeply interested in the material for its own sake. The method of grading will be decided later on. At the moment we have several “unofficial TFs” who are putting effort into designing assignments that will help you get better at running your own experiments, but we don’t have any official TFs. We will try to find ways for you to get feedback on your work, even if we don’t have the resources to grade it.