TY - GEN
T1 - Large-scale Entity Alignment via Knowledge Graph Merging, Partitioning and Embedding
AU - Xin, Kexuan
AU - Sun, Zequn
AU - Hua, Wen
AU - Hu, Wei
AU - Qu, Jianfeng
AU - Zhou, Xiaofang
N1 - Funding Information:
This work was partially supported by the Australian Research Council under Grant No. DE210100160 and DP200103650.
Publisher Copyright:
© 2022 ACM.
PY - 2022/10/17
Y1 - 2022/10/17
N2 - Entity alignment is a crucial task in knowledge graph fusion. However, most entity alignment approaches have the scalability problem. Recent methods address this issue by dividing large KGs into small blocks for embedding and alignment learning in each. However, such a partitioning and learning process results in an excessive loss of structure and alignment. Therefore, in this work, we propose a scalable GNN-based entity alignment approach to reduce the structure and alignment loss from three perspectives. First, we propose a centrality-based subgraph generation algorithm to recall some landmark entities serving as the bridges between different subgraphs. Second, we introduce self-supervised entity reconstruction to recover entity representations from incomplete neighborhood subgraphs, and design cross-subgraph negative sampling to incorporate entities from other subgraphs in alignment learning. Third, during the inference process, we merge the embeddings of subgraphs to make a single space for alignment search. Experimental results on the benchmark OpenEA dataset and the proposed large DBpedia1M dataset verify the effectiveness of our approach.
AB - Entity alignment is a crucial task in knowledge graph fusion. However, most entity alignment approaches have the scalability problem. Recent methods address this issue by dividing large KGs into small blocks for embedding and alignment learning in each. However, such a partitioning and learning process results in an excessive loss of structure and alignment. Therefore, in this work, we propose a scalable GNN-based entity alignment approach to reduce the structure and alignment loss from three perspectives. First, we propose a centrality-based subgraph generation algorithm to recall some landmark entities serving as the bridges between different subgraphs. Second, we introduce self-supervised entity reconstruction to recover entity representations from incomplete neighborhood subgraphs, and design cross-subgraph negative sampling to incorporate entities from other subgraphs in alignment learning. Third, during the inference process, we merge the embeddings of subgraphs to make a single space for alignment search. Experimental results on the benchmark OpenEA dataset and the proposed large DBpedia1M dataset verify the effectiveness of our approach.
KW - entity alignment
KW - graph neural networks
KW - large-scale knowledge graph
KW - representation learning
UR - http://www.scopus.com/inward/record.url?scp=85140854413&partnerID=8YFLogxK
U2 - 10.1145/3511808.3557374
DO - 10.1145/3511808.3557374
M3 - Conference article published in proceeding or book
AN - SCOPUS:85140854413
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2240
EP - 2249
BT - CIKM 2022 - Proceedings of the 31st ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 31st ACM International Conference on Information and Knowledge Management, CIKM 2022
Y2 - 17 October 2022 through 21 October 2022
ER -