An Efficient and Effective Approach for Multi-fact Extraction from Text Corpus

Jianfeng Qu, Wen Hua, Dantong Ouyang, Xiaofang Zhou

Research output: Journal article publication > Journal article > Academic research > peer-review

1 Citation (Scopus)

Abstract

Relation extraction (RE) is a fundamental task with various real-world applications. Although significant progress has been achieved in this research field, it is still limited to single-fact extraction. In practice, however, people tend to describe multiple relations in a single sentence. Multi-fact extraction is therefore a more realistic setting, yet a more challenging one because of the mixture of diverse information. To address this issue, we introduce a novel syntax-based model for multi-fact extraction. Specifically, we propose a relational-expressiveness-based pruning strategy to refine the dependency parse tree of each sentence, and then incorporate the customized and simplified syntax information into sentence encoding via Graph Convolutional Networks. In addition, our model uses distance embeddings to inform the extractor of the status of each word with respect to different entity pairs in a sentence, based on its shortest dependency path to the entities of interest. We also explore a fine-grained pooling strategy that integrates various pieces of evidence so that the relation extractor can make accurate predictions. We conduct extensive experiments on publicly available datasets, and the experimental results verify the superiority of our model for multi-fact extraction in terms of both effectiveness and efficiency.
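
To make the syntax-based ideas in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a dependency parse given as a list of head indices, computes how far each token lies from the shortest dependency path between one entity pair, and applies a single mean-aggregating graph convolution over the tree. The function names (build_adjacency, distance_to_path, gcn_layer), the toy sentence, and the exact distance definition are all illustrative assumptions. The abstract does not specify the paper's pruning rule or embedding scheme, so neither is reproduced here.

```python
from collections import deque


def build_adjacency(heads):
    """Undirected adjacency lists; heads[i] is token i's head index, -1 for the root."""
    adj = [[] for _ in heads]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].append(head)
            adj[head].append(child)
    return adj


def tree_distances(adj, source):
    """Breadth-first hop counts from `source` to every token in the tree."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if dist[nxt] == -1:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def distance_to_path(adj, subj, obj):
    """Hops from each token to the shortest dependency path between subj and obj.

    Assumed definition: in a tree, the distance from token t to the path
    between s and o equals (d(t, s) + d(t, o) - d(s, o)) / 2.
    """
    d_subj = tree_distances(adj, subj)
    d_obj = tree_distances(adj, obj)
    path_len = d_subj[obj]
    return [(d_subj[t] + d_obj[t] - path_len) // 2 for t in range(len(adj))]


def gcn_layer(adj, features, weight):
    """One graph convolution: average each node with its neighbours, project, ReLU."""
    dim_in, dim_out = len(weight), len(weight[0])
    output = []
    for node, neighbours in enumerate(adj):
        group = [node] + neighbours  # self-loop plus tree neighbours
        mean = [sum(features[g][k] for g in group) / len(group) for k in range(dim_in)]
        output.append(
            [max(0.0, sum(mean[k] * weight[k][j] for k in range(dim_in)))
             for j in range(dim_out)]
        )
    return output


# Toy parse (hypothetical): "Alice founded Acme in Paris"
tokens = ["Alice", "founded", "Acme", "in", "Paris"]
heads = [1, -1, 1, 4, 1]
adj = build_adjacency(heads)

# The same token receives a different distance for each entity pair.
print(distance_to_path(adj, 0, 2))  # pair (Alice, Acme)
print(distance_to_path(adj, 2, 4))  # pair (Acme, Paris)
```

In this toy parse the token "in" lies two hops from the (Alice, Acme) path but only one hop from the (Acme, Paris) path. This is the kind of pair-specific signal a distance embedding could encode when several facts share one sentence.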

Original language: English
Pages (from-to): 195-218
Number of pages: 24
Journal: World Wide Web
Volume: 25
Issue number: 1
Publication status: Published - Jan 2022
Externally published: Yes

Keywords

  • Dependency parse tree
  • Graph convolutional networks
  • Multi-fact
  • Pruning strategy
  • Relation extraction

ASJC Scopus subject areas

  • Information Systems
