David McAllester
[intermediate/advanced] Information Theory for Deep Learning
Summary
Information theory is central to deep learning. Most fundamental is the cross-entropy loss used in training classifiers, but generative adversarial networks (GANs) and variational autoencoders (VAEs) are also defined by information-theoretic loss functions. Recently, methods based on contrastive predictive coding (CPC), motivated by the maximization of mutual information, have proved extremely effective in the self-supervised training of image features. Even more recently, more direct maximization of mutual information has proved more effective still. This course will explore information theory from the perspective of current deep learning architectures, with an emphasis on recent empirical results.
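As a concrete illustration of the first point, here is a minimal sketch of the cross-entropy loss for a classifier, in plain NumPy (my own illustration, not code from the course): the loss on one example is the negative log-probability the model assigns to the true label, and its average over the data estimates the cross entropy between the data distribution and the model.

    import numpy as np

    def cross_entropy_loss(logits, label):
        # Negative log-probability assigned to the true label.
        # Averaged over a dataset this estimates the cross entropy
        # H(data, model) between the data distribution and the model.
        z = logits - logits.max()                # stabilize the softmax
        log_probs = z - np.log(np.exp(z).sum())  # log-softmax
        return -log_probs[label]

    # Example: a 3-class classifier that puts most of its mass on class 0.
    print(cross_entropy_loss(np.array([2.0, 0.5, -1.0]), label=0))  # ~0.24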
Syllabus
- The definition of information and Shannon’s source coding theorem.
- Generative adversarial networks (GANs) and variational autoencoders (VAEs) from an information-theoretic perspective.
- Perils of differential entropy and the rise of discrete VAEs.
- Mutual information and self-supervised learning (a sketch of the CPC bound follows this list).
- Formal limitations on the measurement of mutual information.
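As a preview of the fourth syllabus item, the CPC objective of van den Oord et al. (InfoNCE) gives a variational lower bound on mutual information: for a batch of N paired samples, log N minus the contrastive cross entropy lower-bounds I(X; Y). The NumPy sketch below is my own illustration of that formula, with a precomputed score matrix standing in for a learned score function.

    import numpy as np

    def info_nce_bound(scores):
        # scores[i, j] is a learned score f(x_i, y_j) for a batch of N
        # paired samples (x_i, y_i); the diagonal holds the true pairs.
        # Returns log N minus the contrastive cross entropy, which is
        # a lower bound on the mutual information I(X; Y).
        n = scores.shape[0]
        z = scores - scores.max(axis=1, keepdims=True)                # stabilize
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))  # row log-softmax
        ce = -np.mean(np.diag(log_probs))  # cross entropy of identifying y_i given x_i
        return np.log(n) - ce              # bound is at most log N

    # Toy batch of 8 pairs where true pairs (the diagonal) score highest.
    rng = np.random.default_rng(0)
    scores = rng.normal(size=(8, 8)) + 3.0 * np.eye(8)
    print(info_nce_bound(scores))  # positive, capped at log 8 ≈ 2.08

The log N cap on this bound is one instance of the measurement limitations covered in the final syllabus item.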
References
- Cover and Thomas, Elements of Information Theory, Wiley.
- Goodfellow et al., Generative Adversarial Nets, NeurIPS 2014.
- Kingma and Welling, Auto-Encoding Variational Bayes, arXiv:1312.6114, 2013.
- van den Oord, Li, and Vinyals, Representation Learning with Contrastive Predictive Coding, arXiv:1807.03748, 2018.
- Poole et al., On Variational Bounds of Mutual Information, arXiv:1905.06922, 2019.
- McAllester and Stratos, Formal Limitations on the Measurement of Mutual Information, AISTATS 2020.
Pre-requisites
Vector Calculus. Familiarity with convex functions and Jensen’s inequality.
Short bio
David A. McAllester is Professor and former chief academic officer at TTIC (the Toyota Technological Institute at Chicago). He received his B.S., M.S., and Ph.D. degrees from the Massachusetts Institute of Technology and has served on the faculties of Cornell and MIT. He was a member of technical staff at AT&T Labs-Research from 1995 to 2002 and has been a fellow of the American Association for Artificial Intelligence since 1997. He has written over 100 refereed publications. McAllester’s research areas include machine learning theory, the theory of programming languages, automated reasoning, AI planning, computer game playing (computer chess), and computational linguistics. A 1991 paper on AI planning proved to be one of the most influential papers of the decade in that area. A 1993 paper on computer game algorithms influenced the design of the algorithms used in the Deep Blue chess system that defeated Garry Kasparov. A 1998 paper on machine learning theory introduced PAC-Bayesian theorems, which combine Bayesian and non-Bayesian methods. He was a co-author of the deformable part model that dominated object detection in computer vision from 2008 to 2012. He has been teaching fundamentals of deep learning since 2016.