Michalis Vazirgiannis
[intermediate/advanced] Machine Learning with Graphs and Applications
Summary
Graphs are abundant and dominate a large group of applications, as natural data structures that capture the knowledge inherent in data more effectively. In this tutorial we first present the concept of graph similarity via graph kernels, with applications such as community detection and impact evaluation, together with an introduction to the GraKeL Python library, which enables machine learning with graph kernels. We then move to deep learning for graphs, with a gentle introduction to Graph Neural Networks (GNNs), covering node and graph embedding algorithms along with relevant experimental evaluation tasks. We put emphasis on the message-passing GNN model, but also present alternatives such as RWNNs, which enable explainability by learning graphlets as features and thus contribute to a better understanding of GNNs. Finally, we present novel applications of GNNs in image recognition and time-series prediction.
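To make the graph-kernel idea concrete, here is a minimal sketch (in plain Python, with hypothetical toy graphs, not taken from the tutorial material) of the vertex-histogram kernel, one of the simplest graph kernels: it compares two labeled graphs via the inner product of their node-label count vectors, ignoring structure entirely.

```python
from collections import Counter

def vertex_histogram_kernel(labels_a, labels_b):
    """Vertex-histogram kernel: inner product of node-label count vectors.
    A baseline graph kernel that compares graphs only by how often each
    discrete node label occurs."""
    ca, cb = Counter(labels_a), Counter(labels_b)
    return sum(ca[l] * cb[l] for l in set(ca) | set(cb))

# Two toy graphs described only by their node labels (hypothetical example).
g1 = ["C", "C", "O", "H"]
g2 = ["C", "O", "O"]
print(vertex_histogram_kernel(g1, g2))  # 2*1 (C) + 1*2 (O) = 4
```

Richer kernels (shortest-path, Weisfeiler-Lehman, graphlet) refine this idea by counting structural patterns rather than bare labels; the resulting kernel matrix can be fed directly to an SVM for graph classification.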
Syllabus
1. Graph Similarity
- Graph kernels, community detection
- GraKeL Python library – https://github.com/ysig/GraKeL/tree/develop
2. Deep Learning for Graphs – node classification
- Node embeddings (DeepWalk & node2vec) for node classification and link prediction
- Supervised node embeddings (GCNN, …)
3. Deep Learning for Graphs – Graph classification
- Graph CNNs
- Message passing
- Graph auto-encoders
4. Expressivity/explainability – RW-GNNs
- Neural Architecture Search – interpretability
5. Applications of GNNs: NLP, images, bio/medical (AlphaFold, pharma, etc.)
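The message-passing model emphasized in the syllabus can be sketched in a few lines. Below is an illustrative NumPy implementation of a single GCN-style layer (symmetrically normalized neighborhood aggregation, a linear transform, then ReLU); the toy graph and weights are hypothetical, chosen only to show the shapes involved.

```python
import numpy as np

def message_passing_layer(A, H, W):
    """One message-passing (GCN-style) layer.
    A: (n, n) adjacency matrix, H: (n, d) node features, W: (d, d_out) weights.
    Aggregates neighbor features with the symmetrically normalized adjacency,
    applies a learned linear transform, then a ReLU nonlinearity."""
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(deg ** -0.5)     # D^{-1/2} for symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy example: a 3-node path graph, 2-dim one-hot-ish features, identity weights.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.eye(3, 2)
W = np.eye(2)
H1 = message_passing_layer(A, H, W)
print(H1.shape)  # (3, 2): one updated embedding per node
```

Stacking such layers lets each node aggregate information from progressively larger neighborhoods; for graph classification, a readout (e.g. sum or mean over nodes) pools the final node embeddings into a single graph representation.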
References
1. N. Steenfatt, G. Nikolentzos, M. Vazirgiannis, Q. Zhao. Learning Structural Node Representations on Directed Graphs. International Conference on Complex Networks and their Applications, pp. 132-144.
2. A. J. P. Tixier, G. Nikolentzos, P. Meladianos, M. Vazirgiannis. Classifying graphs as images with convolutional neural networks. arXiv preprint arXiv:1708.02218.
Deep learning NLP references
3. Filippova, K. (2010, August). Multi-sentence compression: Finding shortest paths in word graphs. In Proceedings of the 23rd International Conference on Computational Linguistics (pp. 322-330). Association for Computational Linguistics.
4. LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M., & Huang, F. (2006). A tutorial on energy-based learning. Predicting structured data, 1(0).
5. Mihalcea, R., & Tarau, P. (2004). Textrank: Bringing order into text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing.
6. Hoffer, E., & Ailon, N. (2015, October). Deep metric learning using triplet network. In International Workshop on Similarity-Based Pattern Recognition (pp. 84-92). Springer, Cham.
7. Mueller, Jonas, and Aditya Thyagarajan. Siamese recurrent architectures for learning sentence similarity. Thirtieth AAAI Conference on Artificial Intelligence. 2016.
8. Murray, G., Carenini, G., & Ng, R. (2012, June). Using the omega index for evaluating abstractive community detection. In Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization (pp. 10-18). Association for Computational Linguistics.
9. Shang, G., Ding, W., Zhang, Z., Tixier, A. J. P., Meladianos, P., Vazirgiannis, M., & Lorré, J. P. (2018). Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization. arXiv preprint arXiv:1805.05271, ACL 2018.
10. Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. Hierarchical attention networks for document classification. In Proceedings of the 2016 NAACL, pp. 1480–1489. Association for Computational Linguistics.
11. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (pp. 3104-3112).
Deep learning for graphs references
12. F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini. The Graph Neural Network Model. IEEE Transactions on Neural Networks, 20(1):61–80, 2009.
13. Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel. Gated Graph Sequence Neural Networks. arXiv preprint arXiv:1511.05493, 2015.
14. J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl. Neural Message Passing for Quantum Chemistry. In Proceedings of the 34th ICML conference, pp. 1263–1272, 2017.
15. M. Zhang, Z. Cui, M. Neumann, and Y. Chen. An End-to-End Deep Learning Architecture for Graph Classification. In Proc. 32nd AAAI Conference on Artificial Intelligence, pp. 4438–4445, 2018.
16. C. Morris, M. Ritzert, M. Fey, W. L. Hamilton, J. E. Lenssen, G. Rattan, and M. Grohe. Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019.
17. K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How Powerful are Graph Neural Networks? In Proceedings of the 7th International Conference on Learning Representations, 2019.
Pre-requisites
Good understanding of algorithms, machine learning, and deep learning.
Short bio
Dr. Vazirgiannis is a Distinguished Professor at École Polytechnique, Institut Polytechnique de Paris, France. He has conducted research at Fraunhofer and the Max Planck Institute (Germany) and at INRIA/FUTURS (Paris). He has taught data mining, machine learning, and deep learning at AUEB (Greece), École Polytechnique, Télécom ParisTech, and ENS (France), Tsinghua and Shanghai Jiao Tong (China), and Deusto University (Spain). His current research interests lie in deep and machine learning for graphs (including GNNs, community detection, graph classification, clustering and embeddings, and influence maximization), text mining including graph-of-words representations, deep learning for NLP tasks, and applications such as digital marketing, event detection, and summarization. He cooperates actively with industrial partners on data analytics and machine learning for large-scale data repositories in different application domains. He has supervised more than twenty completed PhD theses, has published three books and more than 200 papers in international refereed journals and conferences, and received best paper awards at ACM CIKM 2013 and IJCAI 2018. He has organized large-scale conferences in data mining and machine learning (such as ECML/PKDD) and has served on the senior PC of AI and ML conferences such as AAAI and IJCAI. He has received the ERCIM and Marie Curie EU fellowships and the Rhino-Bird International Academic Expert Award from Tencent, and has held the AXA Data Science (2015-2018) and ANR-HELAS (2020-2024) chairs. More information at the DASCIM web page: http://www.lix.polytechnique.fr/dascim and Google Scholar profile: https://bit.ly/2rwmvQU