TY - GEN
T1 - BoostER: Leveraging Large Language Models for Enhancing Entity Resolution
AU - Li, Huahang
AU - Li, Shuangyin
AU - Hao, Fei
AU - Zhang, Chen Jason
AU - Song, Yuanfeng
AU - Chen, Lei
N1 - Publisher Copyright: © 2024 Copyright held by the owner/author(s).
PY - 2024/5/13
Y1 - 2024/5/13
N2 - Entity resolution, which involves identifying and merging records that refer to the same real-world entity, is a crucial task in areas like Web data integration. This importance is underscored by the presence of numerous duplicated and multi-version data resources on the Web. However, achieving high-quality entity resolution typically demands significant effort. The advent of Large Language Models (LLMs) like GPT-4 has demonstrated advanced linguistic capabilities, which can be a new paradigm for this task. In this paper, we propose a demonstration system named BoostER that examines the possibility of leveraging LLMs in the entity resolution process, revealing advantages in both easy deployment and low cost. Our approach optimally selects a set of matching questions and poses them to LLMs for verification, then refines the distribution of entity resolution results with the response of LLMs. This offers promising prospects to achieve a high-quality entity resolution result for real-world applications, especially to individuals or small companies without the need for extensive model training or significant financial investment.
KW - Entity Resolution
KW - Large Language Models
KW - Web Data Integration
UR - http://www.scopus.com/inward/record.url?scp=85194465934&partnerID=8YFLogxK
U2 - 10.1145/3589335.3651245
DO - 10.1145/3589335.3651245
M3 - Conference article published in proceeding or book
AN - SCOPUS:85194465934
T3 - WWW 2024 Companion - Companion Proceedings of the ACM Web Conference
SP - 1043
EP - 1046
BT - WWW 2024 Companion - Companion Proceedings of the ACM Web Conference
PB - Association for Computing Machinery, Inc
T2 - 33rd ACM Web Conference, WWW 2024
Y2 - 13 May 2024 through 17 May 2024
ER -