
Evan Shelhamer
[intermediate] Test-Time Adaptation for Updating on New and Different Data
Summary
Will every issue be solved by more data, parameters, and training? What if the world changes?
Shift happens, and shift hurts: when the data at test time differs from the data at train time, this change, or shift, can degrade accuracy, calibration, and more. This course reviews train-time adaptation, such as fine-tuning, then covers test-time adaptation: how to update on new and different data during deployment. Test-time adaptation updates on the test data alone for better generalization without auxiliary data, annotations, or excessive computation. We will tour test-time updates to statistics, parameters, inputs, and outputs to highlight the variety of test-time adaptation methods and discuss which to use when. We will finish at the frontier of test-time adaptation today and identify opportunities and obstacles for the next steps in the topic.
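To make these test-time updates concrete, below is a minimal sketch of entropy minimization in the spirit of Tent (Wang* and Shelhamer* et al. ICLR’21; see References): freeze all parameters except normalization scale and shift, normalize by test-batch statistics, and take a gradient step on the entropy of the model’s own predictions for each test batch. This is an illustrative PyTorch sketch under assumptions (a convolutional model with batch norm; the model, learning rate, and loader are placeholders), not the reference implementation.

    import torch
    import torch.nn as nn

    def configure_model(model: nn.Module):
        """Freeze everything except batch-norm scale (weight) and shift (bias)."""
        model.train()  # normalize by batch statistics at test time
        for p in model.parameters():
            p.requires_grad_(False)
        params = []
        for m in model.modules():
            if isinstance(m, nn.BatchNorm2d):
                m.track_running_stats = False  # do not reuse source statistics
                m.running_mean, m.running_var = None, None
                if m.affine:
                    m.weight.requires_grad_(True)
                    m.bias.requires_grad_(True)
                    params += [m.weight, m.bias]
        return params

    def adapt_step(model, x, optimizer):
        """One test-time update: minimize the entropy of the model's predictions."""
        logits = model(x)
        entropy = -(logits.softmax(1) * logits.log_softmax(1)).sum(1).mean()
        optimizer.zero_grad()
        entropy.backward()
        optimizer.step()
        return logits.detach()

    # Usage on an unlabeled test stream (model and test_loader are placeholders):
    # params = configure_model(model)
    # optimizer = torch.optim.SGD(params, lr=1e-3, momentum=0.9)
    # for x in test_loader:
    #     preds = adapt_step(model, x, optimizer).argmax(1)

Note that only the normalization parameters are updated, which keeps the adaptation cheap and stable relative to updating the full model.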
Syllabus
1. Train-Time Adaptation: Learning, Transfer, and Adaptation
- Source and Target Data and their Differences
- Learning from Supervision and Self-Supervision
- Transfer by Fine-Tuning or Reuse
- Adaptation by Training on Source and Target Data
2. Test-Time Adaptation: Losses, Parameters, and Updates
- The Need for Test-Time Updates: Accuracy, Efficiency, Practicality
- Updates to Statistics, Parameters, Inputs, and Outputs (see the statistics sketch after the syllabus)
- Confidence and Entropy Minimization
- Self-Supervision and Auxiliary Tasks
- Generation for Recognition with Diffusion
- Adjusting Outputs
3. More Shifts, More Models, More Steps!
- Combined, Continual, Open-World, and Adversarial Shifts
- Updating with Multiple Models by Ensembling, Soups, and Mixtures
- Evaluating and Tuning Adaptation
- Next Steps for Test-Time Updates in Research, Engineering, and Applications
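As a concrete instance of the statistics updates named above, here is a minimal sketch in the spirit of batch norm adaptation (Schneider et al. NeurIPS’20; see References): drop the normalization statistics estimated on the source data and normalize each test batch by its own statistics. This is the simplest variant, shown here for illustration; Schneider et al. also consider weighting source and test statistics together. The model and batch names are placeholders.

    import torch
    import torch.nn as nn

    def adapt_statistics(model: nn.Module) -> nn.Module:
        """Normalize by test-batch statistics instead of source running averages."""
        for m in model.modules():
            if isinstance(m, nn.BatchNorm2d):
                m.track_running_stats = False
                m.running_mean, m.running_var = None, None  # discard source statistics
        return model

    # Usage (model and test batch x are placeholders): no labels or gradients needed.
    # model = adapt_statistics(model).eval()
    # with torch.no_grad():
    #     preds = model(x).argmax(1)

Updating statistics alone requires no training at test time, which makes it the cheapest of the updates toured in this course.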
References
Main:
- Updating by Entropy Minimization and the TTA setting: Tent. Wang* and Shelhamer* et al. ICLR’21.
- Updating by Self-Supervision and the TTT setting: TTT. Sun et al. ICML’20.
- Updating Statistics: BN adaptation. Schneider et al. NeurIPS’20.
- Updating Outputs: T3A. Iwasawa and Matsuo. NeurIPS’21. LAME. Boudiaf et al. CVPR’22.
- Updating by Diffusion: DiffPure. Nie et al. ICML’22. DDA. Gao* and Zhang* et al. CVPR’23. Diffusion-TTA. Prabhudesai et al. NeurIPS’23.
Multi-Modeling:
- Ensembles. Dietterich. MCS’00.
- Deep Ensembles. Lakshminarayanan et al. NeurIPS’17.
- Model soups. Wortsman et al. ICML’22.
- Seasoning Model Soups. Croce et al. CVPR’23.
Evaluation:
- Evaluating Adaptive Test-Time Defenses. Croce*, Gowal*, Brunner*, Shelhamer* et al. ICML’22.
- Better Practices for Domain Adaptation. Ericsson et al. AutoML’23.
More:
Training and Transfer:
- Fine-tuning and Feature Reuse: DeCAF. Donahue* and Jia* et al. ICML’14.
- How Transferable are Features? Yosinski et al. NeurIPS’14.
- Head2Toe. Evci et al. ICML’22.
- Surgical Fine-Tuning. Lee* and Chen* et al. ICLR’23.
- Parameter-Efficient Fine-Tuning: LoRA. Hu* and Shen* et al. ICLR’22. Adapters. Houlsby et al. ICML’19.
More TTA:
- Survey of TTA. Liang et al. IJCV’25.
- MEMO. Zhang et al. NeurIPS’22.
- EATA. Niu*, Wu*, Zhang* et al. ICML’22.
- CoTTA. Wang et al. CVPR’22.
- NOTE. Gong et al. NeurIPS’22.
- RoTTA. Yuan et al. CVPR’23.
- RDumb. Press et al. NeurIPS’23.
- SAR. Niu*, Wu*, Zhang* et al. ICLR’23.
- PeTTA. Hoang et al. NeurIPS’24.
- FOA. Niu et al. ICML’24.
More TTT:
- TTT++. Liu et al. NeurIPS’21.
- TTT-MAE. Gandelsman* and Sun* et al. NeurIPS’22.
- TTT for RL. Hansen et al. ICLR’21.
- TTT for LLMs. Hardt and Sun. ICLR’24.
Domain Generalization:
- Survey of Domain Generalization. Zhou et al. PAMI’23.
- Residual Adapters. Rebuffi et al. NeurIPS’17.
Pre-requisites
Basic knowledge of machine learning and familiarity with deep learning (for example: nonlinearities like the ReLU; convolutional, recurrent, and attentional networks; and optimization by stochastic gradient descent).
Short bio
Evan Shelhamer is an assistant professor at UBC in Vancouver, a member of the Vector Institute, and a senior research scientist at Google DeepMind. His research is on visual recognition, self-supervised learning without annotations, and robustness by adaptation during deployment. He earned his PhD at UC Berkeley, advised by Prof. Trevor Darrell. He was the lead developer of the Caffe deep learning framework from version 0.1 to 1.0. His research and service have received awards including the best paper honorable mention at CVPR’15 for fully convolutional networks, as well as the Mark Everingham award at ICCV’17, the open-source award at MM’14, and the test-of-time award at MM’24 for Caffe. He organized the 1st workshop on test-time adaptation at CVPR’24 and looks forward to the next steps for test-time updates!