TY - GEN
T1 - ActiveEA: Active Learning for Neural Entity Alignment
AU - Liu, Bing
AU - Scells, Harrisen
AU - Zuccon, Guido
AU - Hua, Wen
AU - Zhao, Genghong
N1 - Funding Information:
This research is supported by the Shenyang Science and Technology Plan Fund (No. 20-201-410) and the Member Program of Neusoft Research of Intelligent Healthcare Technology, Co. Ltd. (No. NRMP001901). Dr Wen Hua is the recipient of an Australian Research Council DECRA Research Fellowship (DE210100160). Dr Guido Zuccon is the recipient of an Australian Research Council DECRA Research Fellowship (DE180101579).
Publisher Copyright:
© 2021 Association for Computational Linguistics
PY - 2021/11
Y1 - 2021/11
N2 - Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion. Current mainstream methods, neural EA models, rely on training with seed alignment, i.e., a set of pre-aligned entity pairs that are very costly to annotate. In this paper, we devise a novel Active Learning (AL) framework for neural EA, aiming to create highly informative seed alignment to obtain more effective EA models with less annotation cost. Our framework tackles two main challenges encountered when applying AL to EA: (1) How to exploit dependencies between entities within the AL strategy. Most AL strategies assume that the data instances to sample are independent and identically distributed. However, entities in KGs are related. To address this challenge, we propose a structure-aware uncertainty sampling strategy that can measure the uncertainty of each entity as well as its impact on its neighbour entities in the KG. (2) How to recognise entities that appear in one KG but not in the other KG (i.e., bachelors). Identifying bachelors would likely save annotation budget. To address this challenge, we devise a bachelor recognizer that pays attention to alleviating the effect of sampling bias. Empirical results show that our proposed AL strategy can significantly improve sampling quality with good generality across different datasets, EA models and numbers of bachelors.
UR - http://www.scopus.com/inward/record.url?scp=85127385854&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
AN - SCOPUS:85127385854
T3 - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
SP - 3364
EP - 3374
BT - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Y2 - 7 November 2021 through 11 November 2021
ER -