Zhiting Hu & Eric P. Xing

University of California, San Diego & Carnegie Mellon University

A “Standard Model” for Machine Learning with All Experiences [virtual]

Summary

In handling a wide variety of experience, ranging from data instances, knowledge, and constraints to rewards, adversaries, and lifelong interaction across an ever-growing spectrum of tasks, contemporary Machine Learning (ML) and Artificial Intelligence (AI) research has produced a multitude of learning paradigms, optimization methods, and model architectures. Despite continual progress on all of these fronts, the disparate, narrowly focused methods make standardized, composable, and reusable development of ML approaches difficult, and they preclude building AI agents that panoramically learn from all types of experience on diverse computing devices. The first part of the course covers a standardized ML formalism, in particular a ‘standard equation’ of the learning objective, that offers a unifying understanding of many important ML algorithms in the supervised, unsupervised, knowledge-constrained, reinforcement, adversarial, and online learning paradigms. These diverse algorithms are encompassed as special cases arising from different choices of modeling components. The framework also provides guidance for the mechanical design of new ML approaches and serves as a promising vehicle toward panoramic machine learning with all experience.

The second part focuses on the system and scalability aspects of today’s ML. We will identify research and practical pain points in large-model training and serving, and introduce new algorithmic techniques and system architectures for training and serving popular big models such as GPT-3, PaLM, and vision transformers. Through this part, we hope to lower the technical barrier to using big models in ML research and bring big models to the masses.

Syllabus

Lecture 1: A ‘Standard Model’ of Machine Learning (I)
Lecture 2: A ‘Standard Model’ of Machine Learning (II)
Lecture 3: Techniques and Systems to Train and Serve Big Models

References

[1] Hu & Xing. “Toward a ‘Standard Model’ of Machine Learning”. Harvard Data Science Review. 2022. https://hdsr.mitpress.mit.edu/pub/zkib7xth/release/2

[2] “Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning”. OSDI. 2022.

Pre-requisites

Introductory course on machine learning.

Short bios

Dr. Zhiting Hu is an Assistant Professor in the Halicioglu Data Science Institute at UC San Diego. He received his Bachelor’s degree in Computer Science from Peking University in 2014, and his Ph.D. in Machine Learning from Carnegie Mellon University in 2020. His research interests lie in the broad areas of machine learning, artificial intelligence, natural language processing, and ML systems. In particular, he is interested in principles, methodologies, and systems for training AI agents with all types of experience (data, symbolic knowledge, rewards, adversaries, lifelong interplay, etc.), and their applications in controllable text generation, healthcare, and other domains. His research was recognized with a best demo nomination at ACL 2019 and an outstanding paper award at ACL 2016.

Dr. Hao Zhang is a postdoctoral researcher at UC Berkeley. He completed his Ph.D. at Carnegie Mellon University. His research interests are at the intersection of machine learning and systems, with a focus on improving the performance and ease of use of today’s distributed ML systems. Hao’s research has been recognized with an NVIDIA pioneer research award at NeurIPS’17 and the Jay Lepreau best paper award at OSDI’21. Hao’s open-source artifacts have been used by organizations such as AI2, Meta, and Google, and parts of Hao’s research have been commercialized at multiple start-ups, including Petuum and AnyScale.

Dr. Eric P. Xing is a Professor of Computer Science at Carnegie Mellon University, and the Founder, CEO, and Chief Scientist of Petuum Inc., a 2018 World Economic Forum Technology Pioneer company that builds a standardized artificial intelligence development platform and operating system for broad and general industrial AI applications. He completed his undergraduate study at Tsinghua University, and holds a PhD in Molecular Biology and Biochemistry from Rutgers University and a PhD in Computer Science from UC Berkeley. His main research interests are the development of machine learning and statistical methodology, and large-scale computational systems and architectures, for solving problems involving automated learning, reasoning, and decision-making in high-dimensional, multimodal, and dynamic possible worlds in artificial, biological, and social systems. Prof. Xing currently serves or has served in the following roles: associate editor of the Journal of the American Statistical Association (JASA), Annals of Applied Statistics (AOAS), IEEE Journal of Pattern Analysis and Machine Intelligence (PAMI), and the PLoS Journal of Computational Biology; action editor of the Machine Learning Journal (MLJ) and Journal of Machine Learning Research (JMLR); member of the United States Department of Defense Advanced Research Projects Agency (DARPA) Information Science and Technology (ISAT) advisory group. He is a recipient of the National Science Foundation (NSF) Career Award, the Alfred P. Sloan Research Fellowship, the United States Air Force Office of Scientific Research Young Investigator Award, the IBM Open Collaborative Research Faculty Award, as well as several best paper awards. Prof. Xing is a board member of the International Machine Learning Society; he has served as the Program Chair (2014) and General Chair (2019) of the International Conference on Machine Learning (ICML); he is also the Associate Department Head of the Machine Learning Department and founding director of the Center for Machine Learning and Health at Carnegie Mellon University; he is a Fellow of AAAI, IEEE, and ASA.
