DeepLearn 2025
12th International School on Deep Learning
(with a special focus on Large Language Models, Foundation Models and Generative AI)
Porto - Maia, Portugal · July 21-25, 2025
Yingbin Liang

Ohio State University

[intermediate/advanced] Theory on Training Dynamics of Transformers

Summary

Transformers, as foundation models, have recently revolutionized many machine learning (ML) applications. Alongside their experimental successes, theoretical studies have emerged to explain why transformers can be trained to achieve such strong performance. This tutorial provides an overview of recent theoretical investigations that characterize the training dynamics of transformer-based ML models. It will also explain the primary techniques and tools employed in these analyses, which draw on information-theoretic concepts in addition to learning theory, stochastic optimization, dynamical systems, and probability.

Syllabus

The tutorial will begin with an introduction to basic transformer models, and then delve into several ML problems where transformers have found extensive applications, such as in-context learning, next-token prediction, and self-supervised learning. For each learning problem, the tutorial will go over the problem formulation, the main theoretical techniques for characterizing the training process, the convergence guarantee and the optimality of the attention models at convergence, the implications for the learning problem, and the insights and guidelines for practical solutions. Finally, the tutorial will discuss future directions and open problems in this actively evolving field.

References

Yu Huang, Yuan Cheng, Yingbin Liang. “In-context convergence of transformers”, Proc. International Conference on Machine Learning (ICML), 2024.

Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi. “In-context learning with representations: Contextual generalization of trained transformers”, Proc. Advances in Neural Information Processing Systems (NeurIPS), 2024.

Ruiquan Huang, Yingbin Liang, Jing Yang. “Non-asymptotic convergence of training transformers for next-token prediction”, Proc. Advances in Neural Information Processing Systems (NeurIPS), 2024.

Yu Huang, Zixin Wen, Yuejie Chi, Yingbin Liang. “How transformers learn diverse attention correlations in masked vision pretraining”, arXiv 2403.02233, 2024.

Pre-requisites

Basics of deep learning, language models (preferred), basics of optimization, and probability theory.

Short bio

Dr. Yingbin Liang is currently a Professor in the Department of Electrical and Computer Engineering at the Ohio State University (OSU), and a core faculty member of the Ohio State Translational Data Analytics Institute (TDAI). She also serves as the Deputy Director of the AI-EDGE Institute at OSU. Dr. Liang received her Ph.D. degree in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005, and served on the faculty of the University of Hawaii and Syracuse University before she joined OSU. Dr. Liang’s research interests include machine learning, optimization, information theory, and statistical signal processing. Dr. Liang received the National Science Foundation CAREER Award and the State of Hawaii Governor Innovation Award in 2009. She also received the EURASIP Best Paper Award in 2014. She is an IEEE Fellow.

Other Courses

Yonina Eldar
Manuela Veloso
Pierre Baldi
Sean Benson
Xavier Bresson
Nello Cristianini
Mark Derdzinski
Samira Ebrahimi Kahou
Elena Giusarma
Shih-Chieh Hsu
Xia “Ben” Hu
Lu Jiang
Jayashree Kalpathy-Cramer
Chen Change Loy
Fenglong Ma & Cao (Danica) Xiao
Evan Shelhamer
Atlas Wang
Xiang Wang
Rex Ying

CO-ORGANIZERS

University of Maia

Institute for Research Development, Training and Advice – IRDTA, Brussels/London

© IRDTA 2024. All Rights Reserved.