Yingbin Liang

Ohio State University

[intermediate/advanced] Bilevel Optimization and Applications in Deep Learning

Summary

Many modern machine learning (ML) problems, such as meta-learning, hyperparameter optimization, neural architecture search, and reinforcement learning, naturally exhibit a bilevel optimization (BO) structure: an inner problem is solved, at least approximately, and its solution then enters an outer objective that is optimized in turn. BO has therefore emerged as a powerful paradigm for principled algorithm design and performance characterization in bilevel ML problems. Recent work has focused on advancing BO algorithms and on using them to improve the efficiency and scalability of bilevel deep learning. This lecture introduces the basic concepts and algorithm design principles of BO, and presents recent research advances in BO and their applications to several major bilevel ML problems.
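For orientation, the bilevel structure described above is commonly written as follows. The notation is an assumption made here for illustration and is not taken from the lecture: x is the outer variable, y the inner variable, f the outer objective, g the inner objective, and Φ the resulting overall objective. The second line gives the gradient of Φ (often called the hypergradient) as obtained from the implicit function theorem, under the additional assumption that g is twice differentiable and strongly convex in y.

```latex
\begin{align*}
\min_{x \in \mathbb{R}^{p}} \ \Phi(x) &:= f\bigl(x, y^{*}(x)\bigr)
  \quad \text{s.t.} \quad y^{*}(x) = \arg\min_{y \in \mathbb{R}^{q}} g(x, y), \\
\nabla \Phi(x) &= \nabla_{x} f\bigl(x, y^{*}(x)\bigr)
  - \nabla_{x}\nabla_{y} g\bigl(x, y^{*}(x)\bigr)
    \bigl[\nabla_{y}^{2} g\bigl(x, y^{*}(x)\bigr)\bigr]^{-1}
    \nabla_{y} f\bigl(x, y^{*}(x)\bigr).
\end{align*}
```

In hyperparameter optimization, for example, x would play the role of the hyperparameters, g the training loss, and f the validation loss.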

Specifically, we will first introduce the formulation of BO and the types of ML problems that it can model. We will then introduce several popular BO algorithms, including approximate implicit differentiation (AID) and iterative differentiation (ITD) algorithms and their stochastic variants, and compare them in terms of convergence rate, computational cost, and scalability. We will further discuss several important implementation issues, such as the impact of loops, second-order computations, and Hessian-free design, and how they affect the performance of BO algorithms in deep learning. We will then present applications of BO algorithms in meta-learning, hyperparameter optimization, and representation learning, together with experimental validation of these algorithms. We will conclude with remarks on open problems and future directions.
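To make the AID/ITD distinction concrete, here is a minimal NumPy sketch on a toy quadratic problem. Everything in it is an assumption chosen for illustration rather than material from the lecture: the inner objective g(x, y) = 0.5 yᵀHy - yᵀAx with H symmetric positive definite, the outer objective f(x, y) = 0.5‖y - b‖² + 0.5 λ‖x‖², and all function names. The ITD variant propagates the Jacobian of the inner gradient-descent iterates with respect to x, while the AID variant solves the linear system H v = ∇_y f at the approximate inner solution. A direct `np.linalg.solve` stands in for the conjugate-gradient or Neumann-series approximation that would typically be used at scale.

```python
import numpy as np


def make_problem(dx=3, dy=4, seed=0):
    """Build a small random instance (hypothetical toy data)."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((dy, dy))
    H = B @ B.T / dy + np.eye(dy)      # symmetric positive definite
    A = rng.standard_normal((dy, dx))
    b = rng.standard_normal(dy)
    return H, A, b


def inner_gd(x, H, A, K, alpha):
    """Run K gradient steps on g(x, .) and track J = d y_K / d x."""
    y = np.zeros(H.shape[0])
    J = np.zeros_like(A)
    for _ in range(K):
        y = y - alpha * (H @ y - A @ x)   # grad_y g = H y - A x
        J = J - alpha * (H @ J - A)       # derivative of the update w.r.t. x
    return y, J


def outer_grads(x, y, b, lam):
    """Return (grad_x f, grad_y f) for the toy outer objective."""
    return lam * x, y - b


def hypergrad_itd(x, H, A, b, lam, K, alpha):
    """ITD: differentiate through the K inner iterations."""
    y, J = inner_gd(x, H, A, K, alpha)
    gx, gy = outer_grads(x, y, b, lam)
    return gx + J.T @ gy


def hypergrad_aid(x, H, A, b, lam, K, alpha):
    """AID: implicit differentiation at the approximate inner solution."""
    y, _ = inner_gd(x, H, A, K, alpha)
    gx, gy = outer_grads(x, y, b, lam)
    v = np.linalg.solve(H, gy)            # stand-in for CG / Neumann series
    return gx + A.T @ v                   # grad_x grad_y g = -A^T here


def hypergrad_exact(x, H, A, b, lam):
    """Closed-form gradient of Phi(x) = f(x, H^{-1} A x)."""
    y_star = np.linalg.solve(H, A @ x)
    return lam * x + A.T @ np.linalg.solve(H, y_star - b)


if __name__ == "__main__":
    H, A, b = make_problem()
    x = np.ones(A.shape[1])
    alpha = 1.0 / np.linalg.eigvalsh(H).max()
    exact = hypergrad_exact(x, H, A, b, lam=0.1)
    for K in (1, 10, 100):
        e_itd = np.linalg.norm(hypergrad_itd(x, H, A, b, 0.1, K, alpha) - exact)
        e_aid = np.linalg.norm(hypergrad_aid(x, H, A, b, 0.1, K, alpha) - exact)
        print(f"K={K:3d}  ITD error={e_itd:.2e}  AID error={e_aid:.2e}")
```

On this toy problem both estimates approach the exact hypergradient as the number of inner steps K grows. The trade-off is in cost: ITD carries derivative information through every inner step, whereas AID needs only the final iterate plus one linear solve.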

Syllabus

Introduction of bilevel optimization and applications in ML
Bilevel optimization algorithms and performance
Implementation issues in deep learning
Application examples: meta-learning, hyperparameter optimization, etc.

References

Kaiyi Ji, Junjie Yang, Yingbin Liang. “Bilevel optimization: Convergence analysis and enhanced design”, Proc. International Conference on Machine Learning (ICML), 2021.

Junjie Yang, Kaiyi Ji, Yingbin Liang. “Provably faster algorithms for bilevel optimization”, Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021.

Daouda Sow, Kaiyi Ji, Yingbin Liang. “On the convergence theory for Hessian-free bilevel algorithms”, Proc. Advances in Neural Information Processing Systems (NeurIPS), 2022.

Kaiyi Ji, Jason D. Lee, Yingbin Liang, H. Vincent Poor. “Convergence of meta-learning with task-specific adaptation over partial parameters”, Proc. Advances in Neural Information Processing Systems (NeurIPS), 2020.

Ankur Sinha, Pekka Malo, Kalyanmoy Deb. “A review on bilevel optimization: from classical to evolutionary approaches and applications”, IEEE Transactions on Evolutionary Computation, 22(2):276–295, 2017.

Pre-requisites

Familiarity with basic optimization concepts and (stochastic) gradient descent methods. Knowledge of basic machine learning problems such as classification and meta-learning.

Short bio

Dr. Yingbin Liang is currently a Professor in the Department of Electrical and Computer Engineering at The Ohio State University (OSU), and a core faculty member of the Ohio State Translational Data Analytics Institute (TDAI). She also serves as the Deputy Director of the AI-EDGE Institute at OSU. Dr. Liang received her Ph.D. degree in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005, and served on the faculty of the University of Hawaii and Syracuse University before joining OSU. Her research interests include machine learning, optimization, information theory, and statistical signal processing. Dr. Liang received the National Science Foundation CAREER Award and the State of Hawaii Governor Innovation Award in 2009, and the EURASIP Best Paper Award in 2014. She is an IEEE Fellow.
