Phillip Isola
[intermediate] Deep Generative Models
Summary
Many of the most impressive AI demos of the last few years have been about synthesizing data. We now have systems that can write compelling stories, hallucinate photorealistic scenes, and synthesize beautiful melodies. What these systems all have in common is that they are deep generative models, and the math and engineering that underlie them are in fact quite simple. My lectures will cover deep generative models from the ground up. I will start with the fundamental principles of generative modeling, then describe popular modern algorithms (including GANs, VAEs, autoregressive models, and diffusion models), and finally cover the many applications of these models. This course will especially emphasize the question “why are generative models useful, and what can you do with them?” I will present four views on their uses: 1) they synthesize novel but realistic data, 2) they can learn powerful latent representations that make data “steerable”, 3) they solve structured, multimodal prediction problems, and 4) they are a tool for counterfactual reasoning.
Syllabus
- Fundamentals of generative modeling: density functions, energy functions, and samplers; maximum likelihood modeling (see the training sketch after this list)
- A tour of popular models: GANs, VAEs, autoregressive models, diffusion models
- Use 1: synthesizing novel but realistic data
- Use 2: learning latent representations of data; steerable data, differentiable data
- Use 3: structured prediction; examples from image-to-image translation and text-to-image generation
- Use 4: counterfactual reasoning; applications towards explainable AI, world models
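To make the first two syllabus items concrete, here is a minimal sketch of maximum likelihood training for a tiny autoregressive model, the principle behind GPT-style models. This is illustrative only, not course code: it assumes PyTorch, and every name and size in it (TinyAutoregressive, VOCAB, the random stand-in data) is made up for the example.

```python
# Minimal sketch (illustrative, not course code): maximum likelihood
# training of a tiny autoregressive model. Assumes PyTorch is installed.
import torch
import torch.nn as nn

VOCAB, DIM, CONTEXT = 16, 32, 8  # made-up sizes for the example

class TinyAutoregressive(nn.Module):
    """Predicts token t+1 from tokens 0..t with a causally masked Transformer."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, VOCAB)  # logits over the next token

    def forward(self, x):
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        return self.head(self.encoder(self.embed(x), mask=mask))

model = TinyAutoregressive()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # cross-entropy = negative log-likelihood

for step in range(100):
    batch = torch.randint(VOCAB, (4, CONTEXT + 1))  # stand-in for real data
    inputs, targets = batch[:, :-1], batch[:, 1:]
    logits = model(inputs)
    # Maximize the likelihood of each next token given its prefix:
    loss = loss_fn(logits.reshape(-1, VOCAB), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sampling: draw each next token from the learned conditional distribution.
model.eval()
seq = torch.zeros(1, 1, dtype=torch.long)
with torch.no_grad():
    for _ in range(CONTEXT):
        probs = model(seq)[:, -1].softmax(dim=-1)
        seq = torch.cat([seq, torch.multinomial(probs, 1)], dim=1)
```

The other models in the tour swap in different objectives for the same goal: GANs train against a discriminator, VAEs maximize a variational lower bound on the likelihood, and diffusion models learn to invert a gradual noising process.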
References
Textbook chapter on deep generative models:
- Goodfellow, Bengio, Courville, “Deep Learning”, Chapter 20 [https://www.deeplearningbook.org/contents/generative_models.html]
Popular models:
- Goodfellow et al., “Generative Adversarial Nets” (GAN) [https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf]
- Kingma & Welling, “Auto-encoding variational bayes” (VAE) [https://arxiv.org/abs/1312.6114]
- Ho et al., “Denoising Diffusion Probabilistic Models” (a diffusion model) [https://arxiv.org/abs/2006.11239]
- Radford et al., “Improving Language Understanding by Generative Pre-Training” (GPT, an autoregressive model) [https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf]
What can you do with these models?
- Bilodeau et al., “Generative models for molecular discovery: Recent advances and challenges” [https://wires.onlinelibrary.wiley.com/doi/pdfdirect/10.1002/wcms.1608]
- Jahanian et al., “On the steerability of generative adversarial networks” [https://arxiv.org/abs/1907.07171]
- Isola et al., “Image-to-Image Translation with Conditional Adversarial Networks” (pix2pix) [https://arxiv.org/abs/1611.07004]
- Zhu et al., “Unpaired Image-to-Image Translation with Cycle-Consistent Adversarial Networks” (CycleGAN) [https://arxiv.org/abs/1703.10593]
- Ramesh et al., “Hierarchical Text-Conditional Image Generation with CLIP Latents” (DALL-E 2) [https://cdn.openai.com/papers/dall-e-2.pdf]
- Lang et al., “Explaining in Style: Training a GAN to explain a classifier in StyleSpace” [https://arxiv.org/abs/2104.13369]
- Ha & Schmidhuber, “World Models” [https://arxiv.org/abs/1803.10122]
Prerequisites
Students should have an introductory background in machine learning and deep learning, and should be comfortable with probabilistic modeling and basic neural net architectures; no other prior knowledge of deep generative models is required.
Short bio
Phillip Isola is an associate professor in EECS at MIT. He studies computer vision, machine learning, and AI. He completed his Ph.D. in Brain & Cognitive Sciences at MIT, and has since spent time at UC Berkeley, OpenAI, and Google Research. Dr. Isola’s research has been recognized by a Google Faculty Research Award, the 2021 PAMI Young Researcher Award, a Packard Fellowship, and a Sloan Fellowship. His current research focuses on how to make artificial intelligence more flexible, general, and adaptive — that is, more like natural intelligence.