Nitesh Chawla

University of Notre Dame

[introductory/intermediate] Graph Representation Learning

Summary

Complex systems, such as web, information, or knowledge systems, are generally represented as (heterogeneous) networks, which richly capture the underlying characteristics and phenomena of the system. These representations include relationships, attributes, content, and temporal information. As such, the networks are inherently heterogeneous (or multi-modal), presenting a very large exploratory space for manual feature construction across downstream modeling tasks. This has led to the development and popularization of representation learning algorithms for graphs and networks. In this tutorial, we will answer the following questions: What is representation learning on graphs? Why do we need it? How do we do it? How do we tackle the challenges of multiple data modalities? What are some of the applications?

Syllabus

Complex Systems as Graphs / Networks
Overview of Representation Learning
Learning Node-based Embeddings: Homogeneous and Heterogeneous Methods
Graph Neural Networks

References

1) Cui, P., Wang, X., Pei, J., & Zhu, W. (2018). A survey on network embedding. IEEE Transactions on Knowledge and Data Engineering, 31(5), 833-852.

2) Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584.

3) Liu, X., & Tang, J. (2021). Network representation learning: A macro and micro view. AI Open, 2, 43-64.

4) Dong, Y., Chawla, N. V., & Swami, A. (2017, August). metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 135-144).

5) Zhang, C., Song, D., Huang, C., Swami, A., & Chawla, N. V. (2019, July). Heterogeneous graph neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 793-803).

6) Saebi, M., Ciampaglia, G. L., Kaplan, L. M., & Chawla, N. V. (2020). HONEM: learning embedding for higher order networks. Big Data, 8(4), 255-269.

Pre-requisites

Introductory machine learning, specifically learning on graphs, and network science.

Short bio

Nitesh Chawla is the Frank M. Freimann Professor of Computer Science and Engineering, and Founding Director of the Lucy Family Institute for Data and Society. His research is focused on machine learning, data science, and network science, and is motivated by the question of how technology can advance the common good through interdisciplinary research. He is the recipient of the IEEE CIS Outstanding Early Career Award, the IBM Watson Faculty Award, the IBM Big Data and Analytics Faculty Award, the National Academy of Engineering New Faculty Fellowship, and the 1st Source Bank Technology Commercialization Award. In recognition of the societal and community impact of his research, he received the Rodney F. Ganey Award and the Michiana 40 under 40 honor. He is the founder of Aunalytics, a data science software and cloud computing company.
