TY - JOUR
T1 - A new self-supervised task on graphs: Geodesic distance prediction
AU - Peng, Zhen
AU - Dong, Yixiang
AU - Luo, Minnan
AU - Wu, Xiao-Ming
AU - Zheng, Qinghua
N1 - Funding Information:
This study was supported by the National Key Research and Development Program of China (No. 2020AAA0108800), the National Natural Science Foundation of China (Nos. 62192781, 61872287, 62137002, 62050194, 61937001), the Innovative Research Group of the National Natural Science Foundation of China (No. 61721002), the Innovation Research Team of the Ministry of Education (IRT_17R86), the CCF-AFSG Research Fund, the Project of China Knowledge Center for Engineering Science and Technology, and the Project of the Chinese Academy of Engineering "The Online and Offline Mixed Educational Service System for 'The Belt and Road' Training in MOOC China". In addition, this research was supported by grants from projects P0001175 and P0030935 funded by PolyU (UGC).
Publisher Copyright:
© 2022 Elsevier Inc.
PY - 2022/8
Y1 - 2022/8
N2 - Heavy dependence on human-curated labels causes supervised algorithms to hit a bottleneck in real applications. Fortunately, a more data-efficient learning paradigm, self-supervised learning (SSL), breaks this dilemma. By creating pretext tasks, self-supervised algorithms learn from the data itself without external supervision. Recently, SSL has yielded immense success in the image domain, but its application to graph mining has received relatively little scrutiny. In fact, rich topological links and affiliated attributes promise great potential for self-supervised graph learning. In this paper, we therefore propose a new pretext task, geodesic distance prediction, to guide node representation learning. Specifically, this task requires neural networks to learn to infer the geodesic distance of one node relative to another. Our underlying hypothesis is that reasoning well about pairwise distances requires the model to extract relational rules between topological distances and attributes. In this way, the complex correlation between nodes can be measured on a simple distance scale. Experiments demonstrate that our S2GRL achieves performance competitive with or better than many state-of-the-art self-supervised methods. A case study on distance inference shows that S2GRL can effectively assess the risk of business transactions or make recommendations from a topological view.
AB - Heavy dependence on human-curated labels causes supervised algorithms to hit a bottleneck in real applications. Fortunately, a more data-efficient learning paradigm, self-supervised learning (SSL), breaks this dilemma. By creating pretext tasks, self-supervised algorithms learn from the data itself without external supervision. Recently, SSL has yielded immense success in the image domain, but its application to graph mining has received relatively little scrutiny. In fact, rich topological links and affiliated attributes promise great potential for self-supervised graph learning. In this paper, we therefore propose a new pretext task, geodesic distance prediction, to guide node representation learning. Specifically, this task requires neural networks to learn to infer the geodesic distance of one node relative to another. Our underlying hypothesis is that reasoning well about pairwise distances requires the model to extract relational rules between topological distances and attributes. In this way, the complex correlation between nodes can be measured on a simple distance scale. Experiments demonstrate that our S2GRL achieves performance competitive with or better than many state-of-the-art self-supervised methods. A case study on distance inference shows that S2GRL can effectively assess the risk of business transactions or make recommendations from a topological view.
KW - Geodesic distance
KW - Graph representation learning
KW - Self-supervised learning
KW - Shortest path distance
UR - http://www.scopus.com/inward/record.url?scp=85132700425&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2022.06.046
DO - 10.1016/j.ins.2022.06.046
M3 - Journal article
AN - SCOPUS:85132700425
SN - 0020-0255
VL - 607
SP - 1195
EP - 1210
JO - Information Sciences
JF - Information Sciences
ER -