Zhiting Hu & Eric P. Xing
A “Standard Model” for Machine Learning with All Experiences [virtual]
Summary
In handling a wide variety of experience, ranging from data instances, knowledge, and constraints to rewards, adversaries, and lifelong interaction in an ever-growing spectrum of tasks, contemporary Machine Learning (ML) and Artificial Intelligence (AI) research has produced a multitude of learning paradigms, optimization methods, and model architectures. Despite continual progress on all these fronts, the disparate, narrowly focused methods make standardized, composable, and reusable development of ML approaches difficult, and preclude the opportunity to build AI agents that panoramically learn from all types of experience on diverse computing devices. The first part of the course covers a standardized ML formalism, in particular a ‘standard equation’ of the learning objective, that offers a unifying understanding of many important ML algorithms in the supervised, unsupervised, knowledge-constrained, reinforcement, adversarial, and online learning paradigms; these diverse algorithms are encompassed as special cases arising from different choices of modeling components. The framework also provides guidance for the mechanical design of new ML approaches and serves as a promising vehicle toward panoramic machine learning with all experience. The second part focuses on the system and scalability aspects of today’s ML. We’ll identify research and practical pain points in large model training and serving, and introduce new algorithmic techniques and system architectures for training and serving popular big models, such as GPT-3, PaLM, and vision transformers. Through this part, we hope to lower the technical barrier to using big models in ML research and bring big models to the masses.
Syllabus
- Lecture 1: A ‘Standard Model’ of Machine Learning (I)
- Lecture 2: A ‘Standard Model’ of Machine Learning (II)
- Lecture 3: Techniques and Systems to Train and Serve Big Models
References
[1] Hu & Xing. “Toward a ‘Standard Model’ of Machine Learning”. Harvard Data Science Review. 2022. https://hdsr.mitpress.mit.edu/pub/zkib7xth/release/2
[2] Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. OSDI’2022.
Pre-requisites
Introductory course on machine learning.
Short bios
Dr. Zhiting Hu is an Assistant Professor in the Halicioglu Data Science Institute at UC San Diego. He received his Bachelor’s degree in Computer Science from Peking University in 2014, and his Ph.D. in Machine Learning from Carnegie Mellon University in 2020. His research interests lie in the broad areas of machine learning, artificial intelligence, natural language processing, and ML systems. In particular, he is interested in principles, methodologies, and systems for training AI agents with all types of experience (data, symbolic knowledge, rewards, adversaries, lifelong interplay, etc.), and their applications in controllable text generation, healthcare, and other domains. His research was recognized with a best demo nomination at ACL 2019 and an outstanding paper award at ACL 2016.
Dr. Hao Zhang is a postdoctoral researcher at UC Berkeley. He completed his Ph.D. at Carnegie Mellon University. His research interests are at the intersection of machine learning and systems, with a focus on improving the performance and ease of use of today’s distributed ML systems. Hao’s research has been recognized with an NVIDIA pioneer research award at NeurIPS’17, and the Jay Lepreau best paper award at OSDI’21. Hao’s open-source artifacts have been used by organizations such as AI2, Meta, and Google, and parts of Hao’s research have been commercialized at multiple start-ups including Petuum and AnyScale.
Dr. Eric P. Xing is a Professor of Computer Science at Carnegie Mellon University, and the Founder, CEO, and Chief Scientist of Petuum Inc., a 2018 World Economic Forum Technology Pioneer company that builds a standardized artificial intelligence development platform and operating system for broad and general industrial AI applications. He completed his undergraduate study at Tsinghua University, and holds a PhD in Molecular Biology and Biochemistry from Rutgers University and a PhD in Computer Science from UC Berkeley. His main research interests are the development of machine learning and statistical methodology, and large-scale computational systems and architectures, for solving problems involving automated learning, reasoning, and decision-making in high-dimensional, multimodal, and dynamic possible worlds in artificial, biological, and social systems. Prof. Xing currently serves or has served in the following roles: associate editor of the Journal of the American Statistical Association (JASA), Annals of Applied Statistics (AOAS), IEEE Journal of Pattern Analysis and Machine Intelligence (PAMI) and the PLoS Journal of Computational Biology; action editor of the Machine Learning Journal (MLJ) and Journal of Machine Learning Research (JMLR); member of the United States Department of Defense Advanced Research Projects Agency (DARPA) Information Science and Technology (ISAT) advisory group. He is a recipient of the National Science Foundation (NSF) Career Award, the Alfred P. Sloan Research Fellowship, the United States Air Force Office of Scientific Research Young Investigator Award, the IBM Open Collaborative Research Faculty Award, as well as several best paper awards. Prof. Xing is a board member of the International Machine Learning Society; he has served as the Program Chair (2014) and General Chair (2019) of the International Conference on Machine Learning (ICML); he is also the Associate Department Head of the Machine Learning Department and founding director of the Center for Machine Learning and Health at Carnegie Mellon University; he is a Fellow of AAAI, IEEE, and ASA.