Wojciech Samek
[introductory/intermediate] From Feature Attributions to Next-Generation Explainable AI
Summary
The domain of Explainable Artificial Intelligence (XAI) has made significant strides in recent years. Various explanation techniques have been devised, each serving distinct purposes. Some of them explain individual predictions of AI models by highlighting influential input features, while others enhance comprehension of the model’s internal operations by visualizing the concepts encoded by individual neurons. These initial XAI techniques have proven valuable in scrutinizing models and detecting flawed prediction strategies (referred to as “Clever Hans” behaviors). This tutorial will give a structured overview of the prominent approaches in XAI and discuss next-generation techniques that provide more human-understandable and actionable explanations, thus delivering maximum usefulness in real-world scenarios. Additionally, the advancement of generative AI, notably the emergence of exceedingly large language models (LLMs), has underscored the necessity for next-generation explanation methodologies tailored to this fundamentally distinct category of models and challenges. This tutorial will address this necessity from various angles and discuss recent methodological breakthroughs that allow us to gain deeper insights into the mysterious world of LLMs.
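To make the notion of a feature attribution concrete, the following minimal sketch (not taken from the tutorial material; the toy model, feature dimensions and variable names are illustrative assumptions) computes Gradient x Input relevance scores for one prediction of a PyTorch classifier:

import torch
import torch.nn as nn

# Toy classifier standing in for any differentiable model (hypothetical placeholder).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
model.eval()

x = torch.randn(1, 4, requires_grad=True)   # one input sample with 4 features
logits = model(x)
target = logits.argmax(dim=1).item()        # explain the class the model predicts

logits[0, target].backward()                # gradient of the target logit w.r.t. the input
attribution = (x.grad * x).detach()         # Gradient x Input: per-feature relevance scores
print(attribution)

The sign and magnitude of each score indicate how strongly, and in which direction, the corresponding input feature contributed to the predicted class; dedicated attribution methods such as Layer-wise Relevance Propagation refine this basic idea with more robust propagation rules.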
Syllabus
The first part of the tutorial will discuss “classical” XAI techniques, their applications and theoretical underpinnings, as well as challenges and misconceptions that were common during the first wave of explainable AI research. The second part will focus on more recent developments in the field. In particular, we will discuss next-generation XAI methods, which provide more complete, more human-understandable and more actionable explanations, thereby enabling the expert user to systematically understand, debug and improve their AI model. The last part will present recent developments around XAI for Foundation Models.
The topics covered are:
- Motivations: Black-box models and the “Clever Hans” effect
- Classical Explainable AI: Concepts, methods & applications
- Challenges and Common Misconceptions in XAI
- Next-generation XAI methods: From feature attribution to concept-level, human-understandable and actionable explanations
- XAI-based model debugging & improvement
- XAI and Foundation Models
References
W Samek, G Montavon, S Lapuschkin, C Anders, KR Müller. Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications, Proceedings of the IEEE, 109(3):247-278, 2021.
https://doi.org/10.1109/JPROC.2021.3060483
Luca Longo, Mario Brcic, Federico Cabitza, Jaesik Choi, Roberto Confalonieri, Javier Del Ser, Riccardo Guidotti, Yoichi Hayashi, Francisco Herrera, Andreas Holzinger, Richard Jiang, Hassan Khosravi, Freddy Lecue, Gianclaudio Malgieri, Andrés Páez, Wojciech Samek, Johannes Schneider, Timo Speith, Simone Stumpf:
Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions
Information Fusion, 106:102301, 2024
https://arxiv.org/abs/2310.19775
Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek:
AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers
arXiv:2402.05602, 2024
http://arxiv.org/abs/2402.05602
Maximilian Dreyer, Reduan Achtibat, Wojciech Samek, Sebastian Lapuschkin:
Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations
arXiv:2311.16681, 2023
https://arxiv.org/abs/2311.16681
Johanna Vielhaben, Sebastian Lapuschkin, Grégoire Montavon, Wojciech Samek:
Explainable AI for Time Series via Virtual Inspection Layers
Pattern Recognition, 150:110309, 2024
https://doi.org/10.1016/j.patcog.2024.110309
Maximilian Dreyer, Frederik Pahde, Christopher J. Anders, Wojciech Samek, Sebastian Lapuschkin:
From Hope to Safety: Unlearning Biases of Deep Models via Gradient Penalization in Latent Space
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
https://arxiv.org/abs/2308.09437
Frederik Pahde, Maximilian Dreyer, Wojciech Samek, Sebastian Lapuschkin:
Reveal to Revise: An Explainable AI Life Cycle for Iterative Bias Correction of Deep Models
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. LNCS, 14221:596-606, Springer, Cham, 2023
https://doi.org/10.1007/978-3-031-43895-0_56
C Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 1, 206–215 (2019).
https://doi.org/10.1038/s42256-019-0048-x
S M Lundberg, G Erion, H Chen et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2, 56–67 (2020).
https://doi.org/10.1038/s42256-019-0138-9
F Doshi-Velez, B Kim. Towards A Rigorous Science of Interpretable Machine Learning. arXiv:1702.08608, 2017.
https://arxiv.org/abs/1702.08608
P Schramowski, W Stammer, S Teso, et al. Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nat Mach Intell 2, 476–486 (2020).
https://doi.org/10.1038/s42256-020-0212-3
Pre-requisites
Basic understanding of machine learning and deep learning.
Short bio
Wojciech Samek is a Professor in the EECS Department at TU Berlin and is jointly heading the AI Department at Fraunhofer HHI. He is a Fellow at BIFOLD – Berlin Institute for the Foundations of Learning and Data, the ELLIS Unit Berlin, and the DFG Research Unit DeSBi. Furthermore, he is a Senior Editor for IEEE TNNLS, an Associate Editor for Pattern Recognition, and an elected member of the IEEE MLSP Technical Committee and Germany’s Platform for AI. He has co-authored more than 200 papers, was the leading editor of the Springer book “Explainable AI: Interpreting, Explaining and Visualizing Deep Learning” (2019), and co-editor of the open access Springer book “xxAI – Beyond explainable AI” (2022). He has served as Program Co-Chair for IEEE MLSP’23 and as Area Chair for NAACL’21 and NeurIPS’23, and is a recipient of multiple best paper awards, including the 2020 Pattern Recognition Best Paper Award and the 2022 Digital Signal Processing Best Paper Prize.