Amos Storkey
[intermediate] Meta-Learning and Contrastive Learning for Robust Representations
Summary
Given all the headlines about the wonders of machine learning, why is it so hard to get our chosen machine learning method to actually work well at deployment time? Real-world machine learning needs to work robustly in changing scenarios and multiple settings, often with little data for the specific situation. Yet even now, people rarely develop their models with this explicitly in mind.
On this course, we explore what we need to do to ensure our models work robustly across scenarios. This is especially important if we are deploying a model for use by multiple individuals or in multiple different settings, but it is often just as important when we simply need the model to be robust to the real world. The course will look at the need for robust and adaptive machine learning; the causes of non-robustness; building datasets and evaluation approaches that ensure robust methods; hierarchical models and meta-learning for building robust and adaptable models; and handling uncertainty within changing environments. We also consider the deploy, learn and adapt cycle, and active learning approaches that make the best of the resources available. The course will consider many practical settings by way of example, including medical settings, automated control (e.g. self-driving) and machine learning on edge devices.
Finally, we will look at the potential directions the future may hold, including a more distributed approach to machine learning deployment that differs from the current centralised, monolithic, big-company-dominated one.
At the end of the course, attendees should be familiar with the foundations for building robust models, the practical business of achieving that in the real world and, for those who are interested, the potential new developments and research directions in this area.
Syllabus
- Causes of lack of robustness: dataset shift; collection bias; mismatch; over-curation; overfitting; non-adaptivity.
- Building datasets: be dirty; keep side information; go back in time; cover many demographics; deploy safely, collect and adapt.
- Building training and evaluation sets: side information is key; building hierarchies; multiple holdouts.
- Robustness theory: different sorts of generalisation; sources of uncertainty; acting under uncertainty; collection versus action; value of information.
- Methods and models: hierarchical methods in statistics; meta-learning as hierarchical models; the need for adaptation; model adaptation versus scenario assimilation; attention as adaptation.
- Resource constraints: efficiency without rigidity; to be Bayesian or to not be Bayesian.
- Practical considerations: know the cost; when things go wrong; local or central.
- Looking forward: data privacy, individualised settings, distributed adaptation, low-resource settings.
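As a concrete illustration of the dataset-shift topic above, the sketch below shows importance weighting under covariate shift. Everything here is invented for illustration (the Gaussian training and deployment distributions, the labelling rule, the trivial classifier); in real deployments the density ratio would itself have to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy covariate shift (invented for illustration): training inputs come
# from N(0, 1), deployment inputs from N(1, 1), but the labelling rule
# y = 1[x > 0.5] is the same in both settings.
x_train = rng.normal(0.0, 1.0, size=2000)
y_train = (x_train > 0.5).astype(float)

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Importance weights w(x) = p_deploy(x) / p_train(x); both densities are
# known here, so the ratio is exact (in practice it must be estimated).
w = gaussian_pdf(x_train, 1.0, 1.0) / gaussian_pdf(x_train, 0.0, 1.0)

# Deployment-time error rate of the naive classifier "always predict 0":
# the unweighted training estimate is badly optimistic under the shift.
unweighted_risk = y_train.mean()                 # ~0.31 on the training data
weighted_risk = np.sum(w * y_train) / np.sum(w)  # ~0.69 after shift correction
```

The gap between the two estimates is exactly the kind of silent failure the course's "deploy safely, collect and adapt" advice is aimed at: the model looks fine on held-out training data yet fails at deployment time.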
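The "meta-learning as hierarchical models" and "model adaptation" topics can likewise be sketched with a deliberately tiny first-order MAML-style loop. The task family (scalar regression y = a·x with slopes a in [1, 3]), the scalar model and both learning rates are invented for illustration; real uses would adapt a neural network in the same inner/outer pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

def task_batch(a, n=20):
    """One regression task from an invented family: y = a * x."""
    x = rng.uniform(-1.0, 1.0, size=n)
    return x, a * x

def mse_grad(w, x, y):
    """Gradient of the mean squared error 0.5 * mean((w*x - y)^2) in w."""
    return np.mean((w * x - y) * x)

# First-order MAML-style meta-training: learn a shared initialisation w0
# that adapts well to any slope in the task family after ONE inner step.
w0, inner_lr, outer_lr = 0.0, 0.5, 0.1
for _ in range(500):
    a = rng.uniform(1.0, 3.0)                # sample a task from the family
    x_s, y_s = task_batch(a)                 # support set: adapt to the task
    w_task = w0 - inner_lr * mse_grad(w0, x_s, y_s)
    x_q, y_q = task_batch(a)                 # query set: evaluate adaptation
    w0 -= outer_lr * mse_grad(w_task, x_q, y_q)  # first-order outer update
# w0 ends up near the centre of the task family (around 2), so a single
# gradient step reaches any task in the family far faster than from scratch.
```

This is the hierarchical-model view in miniature: the outer loop learns shared structure across tasks, while the inner loop performs fast per-scenario adaptation with little data.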
References
Hospedales, T. M., Antoniou, A., Micaelli, P. & Storkey, A. J. (2021). Meta-Learning in Neural Networks: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Storkey, A. J. (2009). When Training and Test Sets Are Different: Characterising Learning Transfer. In Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A. & Lawrence, N. D. (Eds.), Dataset Shift in Machine Learning. MIT Press.
Gelman, A. & Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
Bashkirova et al. (2021). VisDA-2021 Competition: Universal Domain Adaptation to Improve Performance on Out-of-Distribution Data. NeurIPS 2021.
Wang et al. (2019). Transferable Normalization: Towards Improving Transferability of Deep Neural Networks. NeurIPS 2019.
Pre-requisites
General machine learning knowledge. Experience of developing neural networks, ideally in real-world settings.
Short bio
Amos Storkey is Professor of Machine Learning and AI at the School of Informatics, University of Edinburgh. He leads the Bayesian and Neural Systems Research Group and is Director of the EPSRC Centre for Doctoral Training in Data Science. On the methodological side, he is known for his contributions to meta-learning and few-shot learning, efficient neural network design, reinforcement learning, dataset shift, and transactional mechanisms for machine learning. His general focus is machine learning for images and video; as part of that he has a long history of developments in medical imaging and efficient methods for robust and adaptive image understanding.