TY - GEN
T1 - S2GAE: Self-Supervised Graph Autoencoders are Generalizable Learners with Graph Masking
AU - Tan, Qiaoyu
AU - Liu, Ninghao
AU - Huang, Xiao
AU - Choi, Soo Hyun
AU - Li, Li
AU - Chen, Rui
AU - Hu, Xia
N1 - Funding Information:
We thank the anonymous reviewers for their feedback. This work is supported in part by NSF (IIS-1849085, IIS-1750074, IIS-2006844).
Publisher Copyright:
© 2023 ACM.
PY - 2023/2/27
Y1 - 2023/2/27
AB - Self-supervised learning (SSL) has been demonstrated to be effective in pre-training models that can be generalized to various downstream tasks. Graph Autoencoder (GAE), an increasingly popular SSL approach on graphs, has been widely explored to learn node representations without ground-truth labels. However, recent studies show that existing GAE methods could only perform well on link prediction tasks, while their performance on classification tasks is rather limited. This limitation casts doubt on the generalizability and adoption of GAE. In this paper, for the first time, we show that GAE can generalize well to both link prediction and classification scenarios, including node-level and graph-level tasks, by redesigning its critical building blocks from the graph masking perspective. Our proposal is called Self-Supervised Graph Autoencoder - S2GAE, which unleashes the power of GAEs with minimal yet nontrivial efforts. Specifically, instead of reconstructing the whole input structure, we randomly mask a portion of edges and learn to reconstruct these missing edges with an effective masking strategy and an expressive decoder network. Moreover, we theoretically prove that S2GAE could be regarded as an edge-level contrastive learning framework, providing insights into why it generalizes well. Empirically, we conduct extensive experiments on 21 benchmark datasets across link prediction and node & graph classification tasks. The results validate the superiority of S2GAE against state-of-the-art generative and contrastive methods. This study demonstrates the potential of GAE as a universal representation learner on graphs. Our code is publicly available at https://github.com/qiaoyu-tan/S2GAE.
KW - masked autoencoders
KW - masked graph autoencoder
KW - self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85145456539&partnerID=8YFLogxK
U2 - 10.1145/3539597.3570404
DO - 10.1145/3539597.3570404
M3 - Conference article published in proceeding or book
AN - SCOPUS:85145456539
T3 - WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining
SP - 787
EP - 795
BT - WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining
PB - Association for Computing Machinery, Inc
T2 - 16th ACM International Conference on Web Search and Data Mining, WSDM 2023
Y2 - 27 February 2023 through 3 March 2023
ER -