Deepak Pathak
[intermediate/advanced] Continually Improving Agents for Generalization in the Wild
Summary
Current AI agents are highly specialized: they perform only the tasks, or handle only the kinds of data, they were trained on, and thus lack the basic common sense possessed by humans. One fundamental reason is that these systems are trained once and kept fixed at deployment, making it difficult to generalize to scenarios far from those seen in the training data. This is in contrast to humans, whose generalization fundamentally stems from the ability to learn and adapt continuously throughout their lifetime.
In this course, I will discuss how we can train AI agents and robots that can perform thousands of tasks in thousands of environments, and begin to understand the world the way humans do. Unlike computer vision or natural language processing, progress in robotics is bottlenecked by data: robot data does not exist on the internet the way images and text do. This poses a chicken-and-egg problem for robotics: to train robots that generalize, we need large amounts of robotic data from diverse environments, but it is impractical to collect such data unless we can already deploy robots that generalize. Passive human videos on the internet can help alleviate this issue by providing diverse scenarios for pretraining robotic skills. However, just watching humans is not enough; the robot needs to learn and improve by autonomously practicing in the real world and adapting its learning to new scenarios.
Inspired by prominent ideas in child development, we will study algorithms that allow AI agents to acquire knowledge and develop skills through continual exploration and adaptation, bootstrapping from watching people. We will unify three mechanisms — learning by watching others (social learning), practicing through exploration (curiosity), and adapting already learned skills in real time (adaptation) — into a continually adaptive robotic framework. I will demonstrate the potential of this framework for scaling up robot learning via case studies: controlling dexterous robotic hands from monocular vision, dynamic legged robots walking from vision on challenging unseen hikes, and mobile manipulators performing a wide range of manipulation tasks in the wild.
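The curiosity mechanism above can be sketched in code. The following is a toy illustration of a prediction-error intrinsic reward in the spirit of the "noreward-rl" reference below: the agent is rewarded where its forward model predicts poorly, and the bonus shrinks as the model learns. The class name, the linear model, and the choice to predict raw observations (rather than learned features, as in the actual method) are all simplifying assumptions for illustration.

```python
import numpy as np

class ForwardModelCuriosity:
    """Toy curiosity bonus: prediction error of a linear forward model.

    Illustrative sketch only; the real method learns the forward model
    in a learned feature space with deep networks.
    """

    def __init__(self, obs_dim, act_dim, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Linear forward model: predicts next observation from (obs, action).
        self.W = rng.normal(scale=0.1, size=(obs_dim, obs_dim + act_dim))
        self.lr = lr

    def intrinsic_reward(self, obs, act, next_obs):
        x = np.concatenate([obs, act])
        pred = self.W @ x
        err = next_obs - pred
        # Curiosity bonus = squared prediction error of the forward model.
        reward = 0.5 * float(err @ err)
        # Online gradient step on the squared error, so the bonus decays
        # for transitions the agent has already learned to predict.
        self.W += self.lr * np.outer(err, x)
        return reward
```

Revisiting the same transition repeatedly drives its bonus toward zero, which is what pushes a curiosity-driven agent to seek out novel states.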
Syllabus
- Curiosity-driven Reinforcement Learning and Exploration for training virtual agents in the open world
- Rapid Motor Adaptation for Legged Robots
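The second syllabus topic, Rapid Motor Adaptation (RMA, from the rma-legged-robots reference below), can be sketched as a two-phase recipe: in simulation, a privileged encoder maps environment parameters to a latent extrinsics vector that conditions the base policy; then an adaptation module is trained to estimate that latent from the state-action history, which, unlike the privileged parameters, is observable on the real robot. The sketch below uses linear models and made-up dimensions throughout; every name and shape is an illustrative assumption, not the actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
E_DIM, Z_DIM, H_DIM, N = 4, 3, 8, 500   # illustrative dimensions

# Phase 1 (simulation): a privileged encoder maps environment parameters e
# (mass, friction, terrain, ...) to a latent extrinsics vector z that
# conditions the base policy. Here the encoder is just a linear map.
W_env = rng.normal(size=(Z_DIM, E_DIM))
e = rng.normal(size=(N, E_DIM))          # sampled environment parameters
z = e @ W_env.T                          # target extrinsics latents

# The robot cannot observe e directly, but its recent state-action history
# carries e's imprint; model that history as a noisy linear function of e.
A = rng.normal(size=(H_DIM, E_DIM))
history = e @ A.T + 0.01 * rng.normal(size=(N, H_DIM))

# Phase 2: train the adaptation module by supervised regression so that
# z_hat predicted from history matches the privileged z. At deployment,
# the base policy consumes z_hat, enabling real-time adaptation.
W_adapt, *_ = np.linalg.lstsq(history, z, rcond=None)
z_hat = history @ W_adapt
mse = float(np.mean((z_hat - z) ** 2))   # small: history suffices to recover z
```

The small regression error is the point of the recipe: the observable history contains enough signal about the hidden environment parameters that the privileged latent can be estimated online, without ever measuring those parameters on hardware.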
References
https://pathak22.github.io/noreward-rl/
https://human2robot.github.io/
https://robotic-telekinesis.github.io/
https://ashish-kmr.github.io/rma-legged-robots/
https://vision-locomotion.github.io/
https://manipulation-locomotion.github.io/
Pre-requisites
Basic background in deep learning and reinforcement learning.
Short bio
Deepak Pathak is a faculty member in the School of Computer Science at Carnegie Mellon University. He received his Ph.D. from UC Berkeley, and his research spans computer vision, machine learning, and robotics. He is a recipient of the Okawa Research Award, the IIT Kanpur Young Alumnus Award, a CoRL 2022 Paper Award, and faculty awards from Google, Samsung, Sony, and GoodAI. Deepak's research has been featured in popular press outlets including The Economist, The Wall Street Journal, Forbes, Quanta Magazine, The Washington Post, CNET, Wired, and MIT Technology Review, among others. Earlier, he received his Bachelor's degree from IIT Kanpur with a Gold Medal in Computer Science. He co-founded VisageMap Inc., later acquired by FaceFirst Inc. Webpage: https://www.cs.cmu.edu/~dpathak/.