Michalis Vazirgiannis

École Polytechnique

[intermediate/advanced] Graph Machine Learning and Multimodal Graph Generative AI

Summary

Graphs are abundant and serve as natural data structures in a large class of applications, since they represent the knowledge inherent in data more effectively. In this mini-course we first present the challenge of Graph Machine Learning (GML) and the concept of graph similarity via graph kernels, with applications such as community detection and impact evaluation. We also introduce the GraKeL Python library, which enables machine learning with graph kernels. We then move to deep learning for graphs, with a gentle introduction to Graph Neural Networks (GNNs), node and graph embedding algorithms, and the relevant experimental evaluation tasks. We emphasize the message passing GNN model but also present alternatives such as the RWNN, which enables explainability, and Hyperbolic Graph Neural Networks, which handle hierarchical relations better. We further introduce Graph Generative AI with deep neural networks and LLMs, with applications to the biomedical and molecule generation domains. Finally, we cover multimodality for graphs.
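
To give a flavor of graph similarity via kernels, here is a minimal pure-Python sketch of a Weisfeiler-Lehman style subtree kernel. It is chosen only for illustration: it is not GraKeL's implementation and not necessarily the kernel the course presents. The function names `wl_features` and `wl_kernel`, and the `(adjacency, labels)` graph representation, are assumptions made for this example.

```python
from collections import Counter

def wl_features(adj, labels, n_iter=2):
    """Count node labels seen over n_iter rounds of WL relabeling.

    adj    : dict mapping each node to a list of its neighbors
    labels : dict mapping each node to an initial (string) label
    """
    feats = Counter(labels.values())
    current = dict(labels)
    for _ in range(n_iter):
        # New label = own label plus the sorted multiset of neighbor labels.
        current = {
            v: (current[v], tuple(sorted(current[u] for u in adj[v])))
            for v in adj
        }
        feats.update(current.values())
    return feats

def wl_kernel(g1, g2, n_iter=2):
    """Dot product of the two label-count vectors (hypothetical helper)."""
    f1 = wl_features(*g1, n_iter=n_iter)
    f2 = wl_features(*g2, n_iter=n_iter)
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())

# Two toy graphs on three nodes with identical initial labels.
triangle = ({0: [1, 2], 1: [0, 2], 2: [0, 1]}, {0: "a", 1: "a", 2: "a"})
path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "a", 1: "a", 2: "a"})

print(wl_kernel(triangle, path, n_iter=1))
```

The kernel value grows with the number of matching local neighborhoods, so a graph is typically more similar to itself than to a structurally different graph of the same size.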

Syllabus

1. Graph Similarity

Graph kernels, community detection
GraKeL Python library — https://github.com/ysig/GraKeL/

2. Deep Learning for Graphs — node classification

Node embeddings (DeepWalk & node2vec) for node classification and link prediction
Supervised node embeddings (GCNN, …)

3. Deep Learning for Graphs — graph classification

Graph CNNs
Message passing, popular GNNs

4. Applications of GNNs

Natural language — document understanding
Bio/medical (ARG prediction)
Time series predictions

5. Generative and pretrained models for graphs

Graph generative models
Generative models for medical graphs
Multi modality for graph generators — protein function text generator, text/mol
Graph LLMs — how LLMs can generate graphs
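
For readers new to the message passing model listed in part 3 of the syllabus, the following is a minimal sketch of a single message passing step with sum aggregation, followed by a sum readout that yields a graph-level vector. It is an illustrative simplification, not any specific published architecture; the names `message_passing_step`, `readout`, `W_self` and `W_neigh` are assumptions made for this example, and the weights are fixed rather than learned.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def message_passing_step(adj, h, W_self, W_neigh):
    """One round: h_v' = ReLU(W_self h_v + W_neigh * sum of neighbor states)."""
    new_h = {}
    for v, neighbors in adj.items():
        dim = len(h[v])
        # Aggregate: element-wise sum of the neighbors' feature vectors.
        agg = [sum(h[u][i] for u in neighbors) for i in range(dim)]
        # Update: combine the node's own state with the aggregated message.
        z = [a + b for a, b in zip(matvec(W_self, h[v]), matvec(W_neigh, agg))]
        new_h[v] = [max(0.0, zi) for zi in z]
    return new_h

def readout(h):
    """Sum node states into one graph-level vector (for graph classification)."""
    dim = len(next(iter(h.values())))
    return [sum(vec[i] for vec in h.values()) for i in range(dim)]

# Toy path graph 0 - 1 - 2 with two-dimensional node features.
adj = {0: [1], 1: [0, 2], 2: [1]}
h0 = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]

h1 = message_passing_step(adj, h0, identity, identity)
graph_vector = readout(h1)
```

Stacking several such steps lets each node's state depend on a progressively larger neighborhood; in a trained GNN the weight matrices would be learned from data.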

References

Graph similarity

Graph kernels: A survey, G. Nikolentzos, G. Siglidis, M. Vazirgiannis, Journal of Artificial Intelligence Research 72, 943-1027.
Learning Structural Node Representations on Directed Graphs, N. Steenfatt, G. Nikolentzos, M. Vazirgiannis, Q. Zhao, International Conference on Complex Networks and their Applications, 132-144.
Classifying graphs as images with convolutional neural networks, A.J.P. Tixier, G. Nikolentzos, P. Meladianos, M. Vazirgiannis, ICANN 2018.

Deep learning for graphs

F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini. The Graph Neural Network Model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009.
Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel. Gated Graph Sequence Neural Networks, 2017 https://arxiv.org/abs/1511.05493
J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl. Neural Message Passing for Quantum Chemistry. In Proceedings of the 34th ICML conference, 1263–1272, 2017.
M. Zhang, Z. Cui, M. Neumann, and Y. Chen. An End-to-End Deep Learning Architecture for Graph Classification. In Proceedings of the 32nd AAAI conference, 4438–4445, 2018.
C. Morris, M. Ritzert, M. Fey, W. L. Hamilton, J. E. Lenssen, G. Rattan, and M. Grohe. Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks. In Proceedings of the 33rd AAAI conference, 2019.
K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How Powerful are Graph Neural Networks? In Proceedings of the 7th International Conference on Learning Representations, 2019.
Yaguang Li, Rose Yu, Cyrus Shahabi and Yan Liu, Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting, https://arxiv.org/pdf/1707.01926.pdf
Defu Cao, Yujing Wang, Juanyong Duan, Ce Zhang, Xia Zhu, Conguri Huang, Yunhai Tong, Bixiong Xu, Jing Bai, Jie Tong and Qi Zhang, Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting, https://arxiv.org/pdf/2103.07719.pdf

Generative and pretrained models for graphs

Yuanfu Lu, Xunqiang Jiang, Yuan Fang, Chuan Shi, Learning to Pre-train Graph Neural Networks, Proceedings of AAAI conference 2021.
Aymen Qabel, Sofiane Ennadir, Giannis Nikolentzos, Johannes F. Lutzeyer, Michail Chatzianastasis, Henrik Boström, Michalis Vazirgiannis, Structure-Aware Antibiotic Resistance Classification Using Graph Neural Networks, https://www.biorxiv.org/content/biorxiv/early/2022/10/08/2022.10.06.511103.full.pdf
Giannis Nikolentzos, Michalis Vazirgiannis, Christos Xypolopoulos, Markus Lingman, Erik G. Brandt, Synthetic electronic health records generated with variational graph autoencoders, https://www.medrxiv.org/content/10.1101/2022.10.17.22281145v1
Hadi Abdine, Michail Chatzianastasis, Costas Bouyioukos, and Michalis Vazirgiannis, Prot2Text: Multimodal protein's function generation with GNNs and transformers. In Proceedings of AAAI conference 2024.
Zhikai Chen, Haitao Mao, Hang Li, Wei Jin, Hongzhi Wen, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Wenqi Fan, Hui Liu, et al., Exploring the potential of large language models (LLMs) in learning on graphs, ACM SIGKDD Explorations Newsletter, 25(2):42–61, 2024.
Dimitrios Christofidellis, Giorgio Giannone, Jannis Born, Ole Winther, Teodoro Laino, and Matteo Manica, Unifying molecular and textual representations via multi-task language modelling. In International Conference on Machine Learning, 6140–6157. PMLR, 2023
Iakovos Evdaimon, Giannis Nikolentzos, Michail Chatzianastasis, Hadi Abdine, and Michalis Vazirgiannis, Neural graph generator: Feature-conditioned graph generation using latent diffusion models, 2024. https://arxiv.org/abs/2403.01535
Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh V Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, and Bryan Hooi, G-retriever: Retrieval-augmented generation for textual graph understanding and question answering, 2024. https://arxiv.org/abs/2402.07630
Bahare Fatemi, Jonathan Halcrow, and Bryan Perozzi, Talk like a graph: Encoding graphs for large language models. In Proceedings of the Twelfth International Conference on Learning Representations, 2024.

Pre-requisites

Good understanding of algorithms, graphs and machine and deep learning.

Short bio

Dr. Vazirgiannis is a Distinguished Professor at École Polytechnique, Institut Polytechnique de Paris in France. He has conducted research at Fraunhofer and Max Planck-MPI (Germany), and at INRIA/FUTURS (Paris). He has taught data mining, machine and deep learning and NLP/LLMs at AUEB (Greece), École Polytechnique, Telecom-Paristech, ENS (France), Jiaotong Shanghai (China), Deusto University (Spain), MBZUAI (UAE), UM6P and Centrale (Morocco). His current research interests are in graph machine/deep learning, including GNNs, community detection, graph classification, clustering and embeddings, influence maximization, and NLP/LLMs. Recently, he has been interested in multimodal pretrained models for downstream and generative tasks. He also has long experience in text mining, including graph-of-words, deep learning for NLP tasks and applications such as digital marketing, event detection and summarization. He has active cooperations with industrial partners in the area of data analytics and machine learning for large scale data repositories in different application domains. He has supervised more than 25 completed PhD theses, has published 3 books and more than 280 papers in international refereed journals and conferences, and has received best paper (or mention) awards at ACM CIKM 2013, IJCAI 2018, and ICWSM 2020. He has organized large scale conferences in the area of data mining and machine learning (such as ECML/PKDD 2011) and has participated in the senior PC of AI and ML conferences such as AAAI and IJCAI. He has recently been invited for talks and keynote speeches at the Webconf 2023 (Austin, TX) and the ICBS 2023 conference (Beijing), and teaches in international schools like DeepLearn 2022-2024 and the International Winter School on Generative AI 2024.
He has received ERCIM and Marie Curie EU fellowships and the Rhino-Bird International Academic Expert Award by Tencent, and he leads important chairs such as AXA Data Science (2015-2018), ANR-HELAS (2020-2025) and WASP/KTH (2020-2025). More information is available at the DASCIM web page: http://www.lix.polytechnique.fr/dascim and on his Google Scholar profile: https://bit.ly/2rwmvQU
