
Cho-Jui Hsieh
[intermediate/advanced] Optimizers for Large Language Model Training
Summary
These lectures will cover the theoretical foundations and practical algorithms of widely used deep-learning optimizers, along with the key challenges of large-scale model training.
Syllabus
- Introduction to (continuous) optimization
- Gradient descent and stochastic gradient descent
- Adaptive optimizers and momentum (update rules sketched after this list)
- Second-order optimizers
- Muon optimizer
- Distributed and large batch training
- Scale-invariant optimizers for LLM fine-tuning
- Challenges in large-scale LLM training
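As a taste of the first part of the syllabus, the sketch below contrasts the update rules of plain (stochastic) gradient descent, heavy-ball momentum, and Adam on a toy quadratic objective. It is an illustrative NumPy sketch, not part of the official course material; the toy objective and all hyperparameter values are chosen only for demonstration.

```python
# Illustrative sketch (not official course material): the update rules behind
# three syllabus topics -- plain SGD, SGD with momentum, and Adam -- applied
# to the toy objective f(w) = 0.5 * ||w||^2, whose gradient is simply w.
import numpy as np

def grad(w):
    # Gradient of f(w) = 0.5 * ||w||^2.
    return w

def sgd(w, lr=0.1, steps=100):
    # Plain gradient descent: w <- w - lr * g
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def sgd_momentum(w, lr=0.1, beta=0.9, steps=100):
    # Heavy-ball momentum: m <- beta * m + g;  w <- w - lr * m
    m = np.zeros_like(w)
    for _ in range(steps):
        m = beta * m + grad(w)
        w = w - lr * m
    return w

def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=100):
    # Adam: bias-corrected first/second moment estimates rescale each coordinate.
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

if __name__ == "__main__":
    w0 = np.array([3.0, -2.0])
    for name, opt in [("SGD", sgd), ("Momentum", sgd_momentum), ("Adam", adam)]:
        print(name, opt(w0.copy()))
```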
References
Pre-requisites
Calculus, linear algebra, mathematical analysis, and machine learning.
Short bio
Cho-Jui Hsieh is an associate professor in the Computer Science Department at UCLA. His work primarily focuses on improving the efficiency and robustness of machine learning systems, and he has made significant contributions to several widely used machine learning packages. He has received the NSF CAREER Award, the Samsung AI Researcher of the Year award, and the Google Research Scholar Award, and his work has been recognized with several paper awards at ICLR, KDD, ICDM, ICPP, and SC.