Michael Mahoney
[intermediate] Practical Neural Network Theory: From Statistical Mechanics Basics to Working with State-of-the-Art Models
Summary
The presentation will cover: empirical results on state-of-the-art neural network models in computer vision, natural language processing, and related areas; a phenomenological theory, grounded in statistical mechanics, built on these empirical results; and how this theory can be used to make predictions about state-of-the-art models, including, e.g., how one can predict trends in the quality of state-of-the-art models even without access to training and testing data, how it can be used to perform model diagnostics, and how it can be used to improve training. It will also cover how these approaches relate to traditional theoretical approaches to neural networks in machine learning, and how they can help bridge the large gap between theory and practice in the area.
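For concreteness, one family of diagnostics in this line of work examines the empirical spectral density (ESD) of each layer's weight matrix and fits a power-law (heavy-tailed) exponent to its tail; trends in these exponents across layers and models can be computed from the weights alone, with no training or testing data. The sketch below is a minimal, self-contained illustration of that idea, not the exact metric of the referenced papers: the Hill-type tail estimator, the fixed tail fraction, and the toy model are all simplifying assumptions made here for illustration.

```python
# Minimal sketch: heavy-tailed spectral diagnostics from weight matrices only.
# Assumptions (illustrative, not from the abstract): a PyTorch model, a simple
# Hill-type estimator for the power-law tail exponent, and a fixed fraction of
# the largest eigenvalues taken as the "tail".

import numpy as np
import torch.nn as nn


def tail_exponent(eigs: np.ndarray, tail_frac: float = 0.5) -> float:
    """Hill-type estimate of the power-law exponent of the ESD tail:
    alpha_hat = 1 + k / sum(log(lambda_i / lambda_min_of_tail)),
    computed over the k largest eigenvalues."""
    eigs = np.sort(eigs)[::-1]
    k = max(2, int(tail_frac * len(eigs)))
    tail = eigs[:k]
    return 1.0 + k / np.sum(np.log(tail / tail[-1]))


def layer_alphas(model: nn.Module) -> dict:
    """Fit a tail exponent to the ESD of W^T W for each weight matrix."""
    alphas = {}
    for name, param in model.named_parameters():
        if param.ndim < 2 or "weight" not in name:
            continue
        W = param.detach().cpu().numpy().reshape(param.shape[0], -1)
        # Squared singular values of W = nonzero eigenvalues of W^T W.
        eigs = np.linalg.svd(W, compute_uv=False) ** 2
        eigs = eigs[eigs > 1e-12]  # drop numerically zero modes
        if len(eigs) >= 4:
            alphas[name] = tail_exponent(eigs)
    return alphas


if __name__ == "__main__":
    # Toy stand-in for a trained network; in practice one would load a
    # pretrained state-of-the-art model and compare exponents across models.
    model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(),
                          nn.Linear(512, 512), nn.ReLU(),
                          nn.Linear(512, 10))
    for name, alpha in layer_alphas(model).items():
        print(f"{name}: alpha = {alpha:.2f}")
```

A (weighted) average of such per-layer exponents is the kind of data-free quality metric discussed in the references below; the papers develop more careful spectral fits and validate them across many architectures.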
Syllabus
A starting point is provided by https://www.stat.berkeley.edu/~mmahoney/talks/dnn_kdd19_fin.pdf
which presents a much less mature version of the topics to be covered.
References
Several, including:
https://www.stat.berkeley.edu/~mmahoney/pubs/predicting-trends-NatCom21.pdf
https://www.stat.berkeley.edu/~mmahoney/pubs/htsr_20-410.pdf
Pre-requisites
Basic knowledge of machine learning, e.g., at the graduate student level. Knowledge of statistical mechanics will not hurt, but it will not be assumed. Knowledge of machine learning theory will be helpful but not essential.
Short bio
Michael W. Mahoney is at the University of California at Berkeley in the Department of Statistics and at the International Computer Science Institute (ICSI). He is also an Amazon Scholar as well as head of the Machine Learning and Analytics Group at the Lawrence Berkeley National Laboratory. He works on algorithmic and statistical aspects of modern large-scale data analysis. Much of his recent research has focused on large-scale machine learning, including randomized matrix algorithms and randomized numerical linear algebra, scalable stochastic optimization, geometric network analysis tools for structure extraction in large informatics graphs, scalable implicit regularization methods, computational methods for neural network analysis, physics-informed machine learning, and applications in genetics, astronomy, medical imaging, social network analysis, and internet data analysis. He received his PhD from Yale University with a dissertation in computational statistical mechanics, and he has worked and taught at Yale University in the mathematics department, at Yahoo Research, and at Stanford University in the mathematics department. Among other things, he was on the national advisory committee of the Statistical and Applied Mathematical Sciences Institute (SAMSI); he was on the National Research Council's Committee on the Analysis of Massive Data; he co-organized the Simons Institute's fall 2013 and 2018 programs on the foundations of data science; he ran the Park City Mathematics Institute's 2016 PCMI Summer Session on The Mathematics of Data; he ran the biennial MMDS Workshops on Algorithms for Modern Massive Data Sets; and he was the Director of the NSF/TRIPODS-funded FODA (Foundations of Data Analysis) Institute at UC Berkeley. More information is available at https://www.stat.berkeley.edu/~mmahoney/.