Arthur Gretton
[intermediate/advanced] Probability Divergences and Generative Models
Summary
Probability divergences are at the heart of much modern machine learning, from training generative adversarial networks, to obtaining disentangled representations of complex scenes, to self-supervised learning. We will introduce two major classes of probability divergences: the integral probability metrics, and phi (or f-) divergences. We then go on to apply these divergences in machine learning settings. Our first application will be in two-sample testing, where we determine whether two samples are from the same distribution: this is a helpful diagnostic when evaluating whether the dataset used to train a model is from the same distribution as the one on which it is being deployed. Our second application will be in training generative adversarial networks, where the divergence serves as a critic function. We will go on to explore some advanced applications of divergences: measuring and testing statistical dependence, and evaluating goodness-of-fit for probabilistic models.
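To make the two-sample-testing application concrete, here is a minimal sketch (not course material, just an illustration) of the unbiased squared-MMD estimator from the Gretton et al. (2012) reference below, with a Gaussian kernel; the bandwidth `sigma=1.0` and the toy Gaussian data are arbitrary choices for the example:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of x and y
    sq = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    # Unbiased estimate of the squared maximum mean discrepancy:
    # E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)], with diagonal terms
    # excluded from the within-sample averages for unbiasedness.
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * kxy.mean()

rng = np.random.default_rng(0)
# Two samples from the same distribution: estimate near zero
same = mmd2_unbiased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
# Mean-shifted sample: estimate clearly positive
diff = mmd2_unbiased(rng.normal(size=(200, 2)),
                     rng.normal(2.0, 1.0, size=(200, 2)))
```

In a real test one would calibrate a rejection threshold, e.g. by permuting the pooled sample; the course covers learned neural-net kernel features for this step.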
Syllabus
- Introduction to probability divergences: phi (f-) divergences and integral probability metrics
- A deep dive into integral probability metrics: varieties of IPM, with emphasis on the maximum mean discrepancy (MMD)
- MMD for two-sample testing, using learned neural net features: application to testing CIFAR10 vs CIFAR10.1
- Probability divergences as critic functions in a generative adversarial network. Generalised energy-based models
- Advanced topics: measuring and testing statistical dependence, evaluating goodness-of-fit for probabilistic models
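As a pointer to the dependence-measurement topic above, the following is a hedged sketch of the (biased) Hilbert-Schmidt Independence Criterion estimator from the 2007 reference below, tr(KHLH)/n^2 with centering matrix H; the Gaussian kernels, bandwidth, and toy data are illustrative choices only:

```python
import numpy as np

def hsic_biased(x, y, sigma=1.0):
    # Biased HSIC estimate: trace(K H L H) / n^2, where K and L are
    # kernel matrices on x and y, and H centers in feature space.
    n = len(x)

    def rbf(a):
        # Gaussian kernel matrix on the rows of a
        sq = np.sum(a**2, 1)[:, None] + np.sum(a**2, 1)[None, :] - 2 * a @ a.T
        return np.exp(-sq / (2 * sigma**2))

    k, l = rbf(x), rbf(y)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(k @ h @ l @ h) / n**2

rng = np.random.default_rng(0)
x = rng.normal(size=(300, 1))
# Independent y: HSIC close to zero (it is O(1/n) under independence)
indep = hsic_biased(x, rng.normal(size=(300, 1)))
# Nonlinearly dependent y = x^2 + noise: HSIC clearly larger,
# even though the linear correlation of x and x^2 is near zero
dep = hsic_biased(x, x**2 + 0.1 * rng.normal(size=(300, 1)))
```

The nonlinear example is the point of using a kernel criterion: a covariance-based measure would miss the x-versus-x² dependence entirely.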
References
Maximum mean discrepancy and two-sample testing:
https://jmlr.csail.mit.edu/papers/v13/gretton12a.html
https://arxiv.org/abs/2002.09116
GANs and generalized energy-based models:
https://arxiv.org/abs/1606.00709
https://arxiv.org/abs/2003.05033
Evaluating statistical dependence:
https://papers.nips.cc/paper/2007/hash/d5cfead94f5350c12c322b5b664544c1-Abstract.html
https://arxiv.org/abs/2106.08320
Evaluating model goodness-of-fit:
https://arxiv.org/abs/1602.02964
Pre-requisites
Linear algebra and statistics, ideally at an advanced undergraduate level or better.
Short bio
Arthur Gretton is a Professor with the Gatsby Computational Neuroscience Unit and director of the Centre for Computational Statistics and Machine Learning (CSML) at UCL. He received degrees in Physics and Systems Engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He previously worked at the MPI for Biological Cybernetics and at the Machine Learning Department, Carnegie Mellon University. Arthur’s recent research interests in machine learning include the design and training of generative models, both implicit (e.g. GANs) and explicit (exponential family and energy-based models), causal modeling, and nonparametric hypothesis testing. Arthur was a program chair for AISTATS in 2016, a tutorials chair for ICML 2018, a workshops chair for ICML 2019, a program chair for the DALI workshop in 2019, and an organiser of the Machine Learning Summer School 2019 in London.