DA 5001 / DA 6400: Privacy in AI

Jul-Nov 2024 @ IIT Madras in CRC 205 in Slot J (weekly schedule)

Instructor: Krishna Pillutla

Announcements     Piazza     Gradescope


Course Description

As AI rapidly advances, so do concerns about data privacy and responsible data usage. This course explores these areas with foundational mathematical principles, cutting-edge research, and practical applications. We will focus on rigorous mathematical tools (with proofs), algorithm design for practical applications, and implementation.

Topics: privacy risks, differential privacy & properties, private learning algorithms, protecting against data reconstruction. We will also cover a sampling of advanced topics including privacy in distributed and federated learning, emerging challenges of privacy and copyright in LLMs and GenAI, and unlearning.

See the course calendar for the up-to-date syllabus.

Logistics

We will use Piazza for communication and Gradescope for submitting assignments, project reports, etc.

Grading

  • Homeworks: 35% (5% for HW0, 10% each for HW1-3)
  • Midterm: 20%
  • Course Project: 40%
    • Proposal: 5%
    • Midpoint Report: 10%
    • Presentation: 10%
    • Final Report: 15%
  • Scribing: 5%
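
As a quick sanity check on the weights above, the following minimal Python sketch combines per-component scores into a final percentage. The function name `final_grade`, the component keys, and the convention that each score is a fraction between 0 and 1 are illustrative assumptions, not part of the official grading process.

```python
# Weights (in percent) copied from the grading list above.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "hw0": 5,
    "hw1": 10,
    "hw2": 10,
    "hw3": 10,
    "midterm": 20,
    "proposal": 5,
    "midpoint_report": 10,
    "presentation": 10,
    "final_report": 15,
    "scribing": 5,
}

# The components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def final_grade(scores):
    """Return the weighted total in percent.

    `scores` maps a component name to the fraction earned (0.0 to 1.0).
    Assumption of this sketch: a missing component counts as 0.
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())
```

For instance, full marks on every component give 100, while full marks on the homeworks alone give 35.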

Homework

We will have 4 homeworks. HW0, released on the first day of class, is meant to be a review (5% of the grade). The other 3 homeworks are worth 10% each.

  • HW0: Released July 29, due August 9th at 11:59 PM.

Instructions

  • Please submit your solutions via Gradescope.
  • The assignments should be typeset in LaTeX (you may use this LaTeX template). Please do not submit handwritten notes.
  • For the mathematical problems, please be succinct and justify all the steps. Proofs are required to be fully rigorous and justified.
  • You have a total budget of 3 late days for homeworks (and scribing), no questions asked. A submission a few minutes after the deadline will also count as a full late day. Further delays after exhausting the late day budget will result in a zero grade for that homework.
  • No late days will be allowed for project-related deadlines.
  • For coding assignments, please add the exported PDF to the rest of your solutions and submit to Gradescope. Please also separately submit your executable Jupyter notebook to Gradescope.
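
The late-day rule above can be sketched in Python as follows. This is only one reading of the policy, not an official calculator: the helper names `late_days_used` and `apply_late_policy` are hypothetical, and the sketch assumes that a submission needing more late days than remain in the budget receives a zero and leaves the remaining budget untouched.

```python
import math
from datetime import datetime

LATE_DAY_BUDGET = 3  # shared across homeworks and scribing
SECONDS_PER_DAY = 24 * 60 * 60


def late_days_used(deadline, submitted):
    """Number of late days charged; any lateness rounds up to a full day."""
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0
    return math.ceil(seconds_late / SECONDS_PER_DAY)


def apply_late_policy(submissions, budget=LATE_DAY_BUDGET):
    """Process (name, deadline, submitted) tuples in chronological order.

    Returns (accepted, remaining): `accepted` maps each name to True if the
    submission is graded normally or False if it gets a zero.
    Assumption of this sketch: a submission that would exceed the remaining
    budget gets a zero and does not consume any late days.
    """
    remaining = budget
    accepted = {}
    for name, deadline, submitted in submissions:
        used = late_days_used(deadline, submitted)
        if used <= remaining:
            remaining -= used
            accepted[name] = True
        else:
            accepted[name] = False
    return accepted, remaining
```

Under this reading, a HW0 submission five minutes after the August 9th, 11:59 PM deadline is charged one full late day, leaving two for the rest of the semester.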

Collaboration Policy

You can collaborate with others on the homework, provided:

  • You acknowledge everybody you worked with in your submission. Similarly, external resources you consulted should also be cited.
  • You write your own solutions and code independently and from scratch. You are required to do this without referring to any material from joint discussions, including written notes or photos. In other words, you must internalise any solution/code deeply enough to recreate it fully by yourself before submitting it as your own work.
  • Copy-pasting is strictly not allowed.

Course Project

The course will involve a final project to be performed in groups of 2 or 3 (exact details TBD).

The course project can be one of three types:

  • An original research project: can be theory/applied/both. You are welcome to work on your own research project, as long as it involves a component related to privacy in AI.
  • Implementation: benchmarking existing algorithms and open-sourcing by creating your own package or contributing to existing ones.
  • In-depth paper analysis: read and analyse the results of a theoretical paper, and reproduce the proofs in your own words.

The project will require a proposal (due end of September), a midpoint review (due 3rd week of October), a presentation (week of Nov. 4th), and a final report (due around Nov. 14th). These details will be announced in the calendar.

Scribing

Please use this LaTeX template for scribing the lecture notes. Scribed notes are due one week from the lecture date at 11:59 pm. For example, the scribed notes for the lecture on Thursday, August 1st 2024 will be due at 11:59 pm on Thursday, August 8th 2024 via Gradescope.

A team of two students will handle each lecture. Each student will have to help scribe a few lectures. Please see the scribing schedule in this spreadsheet.

Resources

The references and reading (including book chapters and papers) for each lecture will be posted on the Calendar/Syllabus page. This will include parts of the following monographs/textbooks (PDFs available for free online):

  • Dwork & Roth (2014). The Algorithmic Foundations of Differential Privacy. PDF
  • Vadhan (2017). The Complexity of Differential Privacy. PDF
  • Near & Abuah (2021). Programming Differential Privacy. PDF & Notebooks

Honour Code

Here is the full honour code.

We fully expect and believe that you will conduct yourself with academic and personal integrity. While we will follow IITM policies, it is ultimately up to you to conduct yourself with integrity for several compelling reasons that go beyond this course.

Respect diversity: There is a place in this classroom and at IITM for everyone who is curious and passionate about exploring knowledge. Let us all be mindful of creating a welcoming and inclusive space.

As the next generation, you have the power to shape the future: aim to make the world a better place!