CS 2881 AI Safety

Fall 2025, Thursdays 3:45pm-6:30pm (First lecture September 4)

Course: CS 2881R - AI Safety

Instructor: Boaz Barak

Teaching Fellows: Natalie Abreu (natalieabreu@g.harvard.edu), Roy Rinberg (royrinberg@g.harvard.edu), Hanlin Zhang (hanlinzhang@g.harvard.edu)

Course Description: This will be a graduate-level course on challenges in the alignment and safety of artificial intelligence. We will consider both technical aspects and questions about the field's societal and other impacts.

Prerequisites: We require mathematical maturity and proficiency with proofs, probability, and information theory, as well as the basics of machine learning at the level of an undergraduate ML course such as Harvard CS 181 or MIT 6.036. You should be familiar with topics such as empirical and population loss, gradient descent, neural networks, linear regression, and principal component analysis. On the applied side, you should be comfortable with Python programming and able to train a basic neural network.

Important: Read the Course Introduction!

Questions? If you have any questions about the course, please email harvardcs2881@gmail.com

Related reading by Boaz:

Previous versions: Spring 2023 ML Theory Seminar, Spring 2021 ML Theory Seminar

Mini Syllabus

Schedule

Classes begin September 2, 2025. Reading period December 4-9, 2025.

Note: This schedule is periodically synchronized with the course schedule Google Doc, which contains the most up-to-date version.

Thursday, September 25, 2025
Model Specifications & Compliance
Thursday, October 16, 2025
Recursive Self-Improvement
  • Is AI R&D an "AI-complete" task?
Experiment:
To be determined
Thursday, October 30, 2025
Military & Surveillance Applications of AI
  • Lethal autonomous weapon systems (LAWS)
  • Strategic stability & escalation risks
  • Mass-scale surveillance infrastructure
Experiment:
To be determined
Thursday, November 13, 2025
Emotional Reliance and Persuasion
  • Domestic & international regulatory approaches
  • Standards-setting & audits
Experiment:
To be determined
Resources:
  • Resources to be determined
Thursday, November 20, 2025
TBD
  • Topics to be determined
Experiment:
To be determined
Resources:
  • Resources to be determined
No lecture on Thursday, November 27 – Thanksgiving Break
Thursday, December 4, 2025
AI 2035 - Possible Futures of AI
  • Student project presentations and discussion of future directions in AI safety research
Resources:
  • Resources to be determined
