Discovering Correlations between Sparse Features in Distant Supervision for Relation Extraction

Jianfeng Qu, Dantong Ouyang, Yuxin Ye, Wen Hua, Xiaofang Zhou

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

9 Citations (Scopus)

Abstract

A recent approach to relation extraction is distant supervision, which generates training data by heuristically aligning a knowledge base with free text and thus avoids human labelling. However, relation mentions are usually represented as bags of words, which ignores correlations between features located in different dimensions and makes relation extraction less effective. To capture the complex characteristics of relation expressions and tie correlated features together, we discover and utilise informative correlations between features in four phases: 1) formulating semantic similarities between lexical features using an embedding method; 2) constructing generative relations for lexical features with different side-window sizes; 3) computing correlation scores between syntactic features through a kernel-based method; and 4) distilling the obtained correlated feature pairs and integrating the informative pairs into existing relation extraction models. Extensive experiments demonstrate that our method effectively discovers correlation information and improves the performance of state-of-the-art relation extraction methods.

Original language: English
Title of host publication: WSDM 2019 - Proceedings of the 12th ACM International Conference on Web Search and Data Mining
Publisher: Association for Computing Machinery, Inc
Pages: 726-734
Number of pages: 9
ISBN (Electronic): 9781450359405
Publication status: Published - 30 Jan 2019
Externally published: Yes
Event: 12th ACM International Conference on Web Search and Data Mining, WSDM 2019 - Melbourne, Australia
Duration: 11 Feb 2019 – 15 Feb 2019

Publication series

Name: WSDM 2019 - Proceedings of the 12th ACM International Conference on Web Search and Data Mining

Conference

Conference: 12th ACM International Conference on Web Search and Data Mining, WSDM 2019
Country/Territory: Australia
City: Melbourne
Period: 11/02/19 – 15/02/19

Keywords

  • Bag-of-words representation
  • Distant supervision
  • Feature correlation
  • Lexical features
  • Syntactic features

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
