Harvard CS 229br: Advanced Topics in the theory of machine learning
Unofficial teaching fellows:
Yamini Bansal: ybansal (at) g.harvard.edu
Gal Kaplun: galkaplun (at) g.harvard.edu
Dimitris Kalimeris: kalimeris (at) g.harvard.edu
Preetum Nakkiran: preetum (at) cs.harvard.edu
Introductory blog post by Boaz: Machine Learning Theory with Bad Drawings
Course description: This will be a graduate-level course on recent advances and open questions in the theory of machine learning, and specifically deep learning. We will review both classical results and recent papers in areas including classifiers and generalization gaps, representation learning, generative models, adversarial robustness and out-of-distribution performance, and more.

This is a fast-moving area and it will be a fast-moving course. We will aim to cover both state-of-the-art results and the intellectual foundations behind them, and to have substantive discussions of both the "big picture" and the technical details of the papers. In addition to the theoretical lectures, the course will involve a programming component aimed at getting students to the point where they can both reproduce results from papers and work on their own research. This component will be largely self-directed, and we expect students to be proficient in Python and in picking up technologies and libraries on their own (aka "Stack Overflow oriented programming"). We will ensure students have access to the appropriate computational resources (i.e., GPUs).

MIT "sister seminar": This Harvard seminar will be coordinated with a "sister seminar" at MIT, taught by Ankur Moitra. We recommend that students taking CS 229br also take the MIT course, but this is not required. The two courses will share some but not all lectures and assignments. So, if you take CS 229br, please keep the Wednesday 12-3 slot free as well.

Prerequisites (for both CS 229br and MIT 18.408): Both courses require mathematical maturity and proficiency with proofs, probability, and information theory, as well as the basics of machine learning. We expect students to have both a theory background (at Harvard: CS 121 and CS 124 or similar; at MIT: 6.046 or similar) and a machine learning background (at Harvard: CS 181 or 183 or similar; at MIT: 6.036 or similar).
Tentative plan and lecture slides
(Plans for future lectures are very tentative. Slides with animation and annotation are posted after the lecture.)
Pre lecture introductory blog.
Monday, January 25: Introduction to course, blitz through classical learning theory, Zhang et al experiments. lecture slides (pdf) - lecture slides (Powerpoint with animations and annotation) - lecture notes (blog) - video
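The Zhang et al. experiments show that modern over-parameterized models can perfectly fit even completely random labels, which challenges classical generalization bounds. As a hedged illustration (a toy sketch not taken from the course materials), the simplest version of this phenomenon already appears in linear regression with more parameters than samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100  # fewer samples than parameters: over-parameterized regime
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)  # purely random labels, no signal at all

# Minimum-norm least-squares fit (closed form via the pseudoinverse).
# Since X has full row rank (generically, when n < d), X @ w reproduces y exactly.
w = np.linalg.pinv(X) @ y
train_acc = np.mean(np.sign(X @ w) == y)
print(train_acc)  # 1.0: the model "memorizes" the random labels perfectly
```

Because the labels carry no information, this interpolating fit necessarily performs at chance on fresh data; the point of the experiments is that uniform-convergence arguments based on model capacity alone cannot explain why the same models generalize well on real labels.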
Monday, February 1: Learning dynamics: over-parameterized linear regression, deep linear networks, simplicity bias, early layers, lower bound for parities. lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - lecture notes (blog) - video
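A central example in the learning-dynamics lecture is the implicit bias of gradient descent on over-parameterized linear regression: among the infinitely many interpolating solutions, gradient descent started at zero converges to the minimum-norm one. A minimal numpy sketch of this standard fact (not code from the course):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 10, 50  # over-parameterized: many interpolating solutions exist
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Plain gradient descent on the squared loss, starting from w = 0.
# Iterates stay in the row space of X, which forces convergence to the
# minimum-norm interpolant.
w = np.zeros(d)
lr = 0.01
for _ in range(20000):
    w -= lr * X.T @ (X @ w - y)

w_min_norm = np.linalg.pinv(X) @ y  # minimum-norm interpolating solution
print(np.max(np.abs(w - w_min_norm)))  # essentially zero
```

The key observation is that each gradient step is a linear combination of the rows of X, so starting from zero the iterates never leave the row space; the unique interpolant in the row space is exactly the minimum-norm solution.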
Monday, February 8: Unsupervised learning and generative models (Auto-encoders, Variational Auto-encoders, Flow based models, auto-regressive models, GANs), representation learning (contrastive learning). lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - video
Monday, February 22: Out of distribution performance, adversarial robustness, transfer learning, meta learning. lecture slides (pdf) - lecture slides (Powerpoint with animation and annotation) - video
Monday, March 8: Transfer learning, meta learning
Monday, March 15: Natural language processing, guest lecture by Sasha Rush
Monday, March 22: Theoretical neuroscience, visualizing and interpreting neural networks, guest lecture by Chris Olah
Monday, March 29: Bandits, contextual bandits, and reinforcement learning
Monday, April 5: Control theory
Monday, April 12: Causality
Monday, April 19: Privacy, fairness
Monday, April 26: Statistical physics view of machine learning
Office hours: Yamini Bansal, Mondays, 5PM - 6PM EST (Message me on Slack and we can start a Zoom call!)
Is there a complete plan of all lectures and assignments?
No - this course will be an experiment, for both me and the students, and we will figure out how much we can cover, and in what way, as we go along. The goal is to start with some of the foundations and to get quickly to discussing recent papers. The intention is for students to get to the point where they can read (and sometimes also reproduce) recent ML papers, and hopefully also be able to generate new insights.
What will the format of the course be like?
We will have weekly lectures/discussions, and experimental homeworks/projects. The lectures will focus on describing and discussing papers and theory, while the problem sets / projects will be more empirical. We will have formal or informal "sections" where the unofficial TFs will help out with technical issues in implementations, but we will also rely on students looking up material and helping one another.
What is expected of students?
Students will be expected to do some reading before lectures, and to work on experimental homework assignments, typically involving reproducing a paper or trying out some experiment. The lectures will not discuss how to run experiments or implement neural networks, but the teaching fellows will be available. We will also expect students to look up resources on their own (such as the excellent deep learning course by LeCun & Canziani) and to help one another. There will also be a project, and students may also need to write scribe notes for one lecture.
How will students be graded?
The course is intended for graduate students, or advanced undergraduate students who have mostly completed their requirements, who are deeply interested in the material for its own sake. The method of grading will be decided later on. At the moment we have several "unofficial TFs" who are putting effort into designing assignments that will make you better at running your own experiments, but we don't have any official TFs. We will try to find ways for you to get feedback on your work, even if we don't have the resources to grade it.